Abstract

A  graph  is  a  key  construct  for  expressing  relationships  among  objects,  such  as  the 
radio  connectivity  between  nodes  contained  in  an  unmanned  vehicle  swarm.  The  study  of 
such  networks  may  include  ranking  nodes  based  on  importance,  for  example,  by  applying 
the  PageRank  algorithm  used  in  some  search  engines  to  order  their  query  responses.  The 
PageRank  values  correspond  to  a  unique  eigenvector  typically  computed  by  applying  the 
power  method,  an  iterative  technique  based  on  matrix  multiplication. 

The  first  new  result  described  herein  is  a  lower  bound  on  the  execution  time  of  the 
PageRank  algorithm  that  is  derived  by  applying  standard  assumptions  to  the  scaling  value 
and  numerical  precision  used  to  determine  the  PageRank  vector.  The  lower  bound  on  the 
PageRank  algorithm’s  execution  time  also  equals  the  time  needed  to  compute  the  coarsest 
equitable  partition,  where  that  partition  is  the  basis  of  all  other  results  described  herein. 

The  second  result  establishes  that  nodes  contained  in  the  same  block  of  a  coarsest 
equitable  partition  must  yield  equal  PageRank  values.  The  third  result  is  an  algorithm  that 
eliminates  differences  in  the  PageRank  values  of  nodes  contained  in  the  same  block  if  the 
PageRank  values  are  computed  using  finite-precision  arithmetic.  The  fourth  result  is  an 
algorithm  that  reduces  the  time  needed  to  find  the  PageRank  vector  by  eliminating  certain 
dot  products  when  any  block  in  the  partition  contains  multiple  vertices.  The  fifth  result  is 
an  algorithm  that  further  reduces  the  time  required  to  obtain  the  PageRank  vector  of  such 
graphs  by  applying  the  quotient  matrix  induced  by  the  coarsest  equitable  partition.  Each 
algorithm’s  complexity  is  derived  with  respect  to  the  number  of  blocks  contained  in  the 
coarsest  equitable  partition  and  compared  to  the  PageRank  algorithm’s  complexity. 


These  results  further  existing  research  in  several  ways.  For  instance,  the  practical 
lower  bound  on  the  PageRank  algorithm’s  execution  time  was  previously  only  suggested 
using  experimental  results.  The  proof  showing  vertices  contained  in  the  same  block  of  the 
coarsest  equitable  partition  have  equal  PageRank  values  is  based  on  relating  dot  products 
and Weisfeiler-Lehman stabilization, an approach that differs substantially from that of
an existing proof. The existing proof was also extended to show the quotient matrix could
be  used  to  reduce  the  PageRank  algorithm’s  execution  time.  However,  its  authors  did  not 
develop  an  algorithm  or  analyze  its  execution  time  bounds.  Finally,  these  results  motivate 
several  avenues  of  future  research  related  to  graph  isomorphism  and  linear  algebra. 
ON GRAPH ISOMORPHISM AND THE PAGERANK ALGORITHM


I.  Introduction 

1.1.  The  PageRank  Vector 

A  graph  is  a  useful  construct  for  expressing  relationships  between  a  set  of  objects, 
such  as  the  radio  connectivity  among  nodes  in  an  unmanned  aerial  vehicle  (UAV)  swarm. 
Some  analysis  tools  use  a  measure  of  node  centrality  to  rank  a  graph’s  vertices,  such  as  a 
UAV  swarm’s  nodes,  by  their  relative  importance.  For  instance,  the  PageRank  algorithm 
is  used  in  certain  search  engines  to  rank  query  responses  [PBM+98]. 

The  algorithm  produces  a  unique  eigenvector,  the  PageRank  vector,  whose  entries 
equal  the  probability  each  node  is  visited  by  an  object  that  randomly  traverses  the  graph. 
The  PageRank  algorithm  uses  the  PageRank  vector,  which  is  guaranteed  to  exist,  to  order 
a  graph’s  vertices,  such  as  a  UAV  swarm’s  nodes,  according  to  their  probability  of  being 
randomly  visited.  The  results  described  in  Chapters  3-5  reduce  the  time  needed  to  obtain 
a  swarm’s  PageRank  vector  if  the  swarm  contains  nodes  having  equal  PageRank  values. 

In  the  context  of  UAV  swarms,  the  PageRank  vector  can  be  used  to  identify  where 
to  inject  a  message  to  ensure  it  is  efficiently  disseminated  among  the  swarm’s  nodes  by  a 
rumor-routing  protocol.  In  social  network  analysis,  a  PageRank  vector  can  be  used  to  find 
group  members  that  are  useful  for  efficiently  spreading  (mis)information.  The  PageRank 
vector  can  also  be  used  to  find  roadblock  locations  for  capturing  fleeing  suspects.  A  more 
sedate  application  determines  which  road  intersections  to  avoid  during  a  city’s  rush  hour. 
In  general,  the  PageRank  vector,  which  is  computed  by  the  PageRank  algorithm,  is  useful 
for  analyzing  the  probable  behavior  of  some  object  traversing  a  network’s  nodes,  such  as 
some  (mis)information  distributed  by  a  rumor-routing  protocol  in  a  UAV  swarm. 
1.2. Research Motivation

The  PageRank  vector  often  canonically  orders  the  graph’s  vertices.  For  example, 
the  mansion  graph  shown  in  Figure  1(a)  [CDR07]  yields  the  PageRank  vector  shown  in 
Table  1(a).  Sorting  the  entries  of  this  vector  in  descending  order  yields  the  vector  listed  in 
Table  1(b).  The  vertex  ordering  induced  by  the  sorted  vector  is  illustrated  in  Figure  1(b). 

The  PageRank  vector  is  unique  up  to  graph  isomorphism,  where  graphs  are  said  to 
be  isomorphs  if  their  edges  define  equivalent  relationships  on  their  vertices.  Since  entries 
in  the  mansion  graph’s  PageRank  vector  are  distinct,  all  mansion  graph  isomorphs  induce 
the  same  vertex  ordering,  i.e.,  the  canonical  ordering  shown  in  Table  1(b).  For  example, 
the  isomorph  shown  in  Figure  1(c)  yields  the  sorted  PageRank  vector  listed  in  Table  1(c), 
which  is  equivalent  to  the  ordering  listed  in  Table  1(b)  and  illustrated  in  Figure  1(b). 
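
The ordering step described above can be illustrated with a short sketch. The values are copied from Table 1(a); the helper names `pagerank_order` and `has_distinct_entries` are hypothetical and introduced only for this example.

```python
# PageRank values copied from Table 1(a); keys are the mansion graph's vertex labels.
pagerank = {"a": 0.126, "b": 0.236, "c": 0.195, "d": 0.182, "e": 0.180, "f": 0.080}


def pagerank_order(values):
    """Return the vertex labels sorted by descending PageRank value."""
    return sorted(values, key=lambda v: values[v], reverse=True)


def has_distinct_entries(values):
    """Return True if no two vertices share a PageRank value."""
    return len(set(values.values())) == len(values)


order = pagerank_order(pagerank)
print(order)
print(has_distinct_entries(pagerank))
```

Because every entry is distinct, the sorted order does not depend on how the vertices were labeled on input, which is the sense in which the ordering is canonical.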


Figure 1. Mansion Graph [CDR07]: (a) Mansion Graph, (b) PageRank Order, (c) Another Isomorph


Table 1. Mansion Graph's PageRank Vector

(a) PageRank Vector    (b) PageRank Order    (c) Isomorph's Order
Vertex  Value          Value  Vertex         Value  Vertex
a       0.126          0.236  a              0.236  b
b       0.236          0.195  b              0.195  c
c       0.195          0.182  c              0.182  d
d       0.182          0.180  d              0.180  e
e       0.180          0.126  e              0.126  a
f       0.080          0.080  f              0.080  f

However, a PageRank vector may contain duplicate entries, which cannot induce
a  canonical  ordering.  The  issue  can  be  resolved  in  some  applications,  e.g.,  search  engines, 
by  sorting  on  other  keys,  such  as  the  data’s  original  location.  A  second  method  is  to  order 
vertices  of  a  canonical  isomorph  according  to  its  PageRank  vector.  For  instance,  nauty  is 
often  used  to  determine  such  a  canonical  isomorph  [McK81,  McK04]. 

For example, the house graph shown in Figure 2(a) [Gol80, Gol04] yields
the PageRank vector listed in Table 2(a). The house graph's canonical isomorph produced
by nauty is shown in Figure 2(b) and yields the PageRank vector listed in Table 2(b). The
vertex mapping, i → j, r → s, denotes that vertex v_i, labeled r, is mapped to the vertex v_j,
labeled s. The canonical isomorph induces the canonical ordering listed in Table 2(c) and
depicted in Figure 2(c). The ties in PageRank value between vertices b and e, and similarly
between vertices c and d, are broken by their relative order in the canonical isomorph.


Figure 2. House Graph [Gol80, Gol04]: (a) House Graph, (b) Canonical Isomorph, (c) PageRank Order

Table 2. House Graph's PageRank Vector

(a) House Graph    (b) Canonical Isomorph      (c) PageRank Order
Vertex  Value      Mapping          Value      Value  Vertex
a       0.168      1 → 3  a → c     0.168      0.244  d
b       0.244      2 → 4  b → d     0.244      0.244  e
c       0.172      3 → 1  c → a     0.172      0.172  a
d       0.172      4 → 2  d → b     0.172      0.172  b
e       0.244      5 → 5  e → e     0.244      0.168  c

1.3. Problem Statement

The  PageRank  vector  of  many  graphs,  such  as  the  mansion  graph,  contains  unique 
entries  and  induces  a  canonical  vertex  ordering.  In  contrast,  the  PageRank  vector  of  some 
graphs,  such  as  the  house  graph,  contains  duplicate  entries  and  cannot  induce  a  canonical 
vertex  ordering.  However,  graphs  containing  vertices  that  yield  the  same  PageRank  value 
suggest  several  methods  of  improving  the  PageRank  algorithm’s  performance. 

One  avenue  to  improvement  is  yielded  by  the  graph’s  coarsest  equitable  partition, 
an  invariant  often  used  in  applications  that  find  canonical  isomorphs,  e.g.,  nauty.  Vertices 
contained  in  the  same  block  are  adjacency-wise  equivalent,  or  equitable,  with  respect  to 
the  vertices  in  other  blocks  and  more  importantly,  appear  to  yield  equal  PageRank  values. 

For example, the house graph's coarsest equitable partition is [{c,d},{a},{b,e}], which
is depicted using distinct shapes in Figure 3. The PageRank values yielded by the vertices
contained in each block are [0.172, 0.168, 0.244], respectively, corresponding to Table 2.
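
A coarsest equitable partition such as the one above can be found by iterated colour refinement, which is the idea behind 1-D Weisfeiler-Lehman stabilization. The following is a minimal sketch rather than the thesis's procedure. The function name is hypothetical, and the house graph's edge list (a-b, a-e, b-e, b-c, c-d, d-e) is an assumption inferred from the adjacencies mentioned in the text.

```python
def coarsest_equitable_partition(adj):
    """Refine vertex colours until the number of colour classes stops growing."""
    color = {v: 0 for v in adj}  # start with every vertex in one block
    while True:
        # A vertex's signature is its own colour plus the multiset of neighbour colours.
        signature = {
            v: (color[v], tuple(sorted(color[u] for u in adj[v]))) for v in adj
        }
        ids = {sig: i for i, sig in enumerate(sorted(set(signature.values())))}
        new_color = {v: ids[signature[v]] for v in adj}
        # Refinement only splits blocks, so an unchanged count means the partition is stable.
        if len(set(new_color.values())) == len(set(color.values())):
            break
        color = new_color
    blocks = {}
    for v, c in new_color.items():
        blocks.setdefault(c, []).append(v)
    return sorted(sorted(block) for block in blocks.values())


# Assumed adjacency lists for the house graph.
house = {
    "a": ["b", "e"],
    "b": ["a", "c", "e"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["a", "b", "d"],
}
partition = coarsest_equitable_partition(house)
print(partition)
```

The blocks are returned in sorted order for reproducibility, so their order may differ from the order used in Figure 3; the blocks themselves are what matter.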


The  first  task  is  to  show  a  relationship  exists  between  PageRank  values  of  vertices 
contained  in  the  same  block.  If  a  relationship  exists,  an  algorithm  must  be  developed  that 
ensures  vertices  contained  in  the  same  block,  e.g.,  b  and  e,  have  equal  PageRank  values, 
within  an  arbitrary  finite  precision.  The  last  task  is  to  construct  algorithms  that  reduce  the 
execution  time  needed  to  compute  the  PageRank  vector  of  such  graphs. 


Figure 3. House Graph's 3-Block Coarsest Equitable Partition, [{c,d},{a},{b,e}]
1.4. Research Goals

The  first  goal  is  to  obtain  a  lower  bound  on  the  PageRank  algorithm’s  complexity 
by  applying  recent  work  that  determined  its  upper  bound  [HaK03].  The  second  goal  is  to 
establish  a  relationship  between  the  coarsest  equitable  partition  and  the  PageRank  vector. 
The  key  insight  is  obtained  by  constructing  a  modified  dot  product  process  that  performs 
1-D  Weisfeiler-Lehman  stabilization,  which  yields  the  coarsest  equitable  partition.  Since 
PageRank  values  can  be  obtained  by  applying  the  power  method,  which  is  an  iterated  dot 
product  process,  vertices  contained  in  the  same  block  of  the  coarsest  equitable  partition 
must  yield  equal  PageRank  values.  Another  proof  of  the  relationship  between  the  coarsest 
equitable  partition  and  PageRank  vectors  was  developed  by  Boldi  et  al.  [BLS+06]. 

The  third  goal  is  to  exploit  the  relationship  between  the  PageRank  vector  and  the 
coarsest  equitable  partition  to  eliminate  differences  between  PageRank  values  of  vertices 
contained  in  the  same  block  of  the  coarsest  equitable  partition,  where  such  differences 
may  occur  if  PageRank  values  are  computed  using  finite  numerical  precision.  The  fourth, 
and  most  important,  goal  is  to  reduce  the  time  needed  to  obtain  the  PageRank  vector.  The 
third  and  fourth  goals  are  achieved  by  the  three  algorithms  described  in  Chapters  4  and  5. 

The  first  algorithm  described  in  Chapter  4,  AverageRank,  sets  the  PageRank  value 
of  every  vertex  to  the  average  PageRank  value  of  the  vertices  in  its  corresponding  block. 
The  second  algorithm,  ProductRank,  reduces  the  time  needed  to  find  the  PageRank  vector 
by  only  computing  a  subset  of  the  dot  products  defined  in  each  power  method  iteration 
and  ensures  each  block’s  vertices  have  equal  PageRank  values.  Hence,  the  ProductRank 
algorithm  supersedes  the  AverageRank  algorithm,  since  it  computes  the  PageRank  vector 
more  efficiently  if  the  graph’s  coarsest  equitable  partition  is  non-discrete. 
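
The averaging step attributed to AverageRank above can be sketched in a few lines. This is only one reading of that one-sentence description, not the algorithm as specified in Chapter 4. The function name `average_rank` and the slightly perturbed input values are hypothetical.

```python
def average_rank(values, partition):
    """Replace each vertex's value with the mean value of the vertices in its block."""
    averaged = dict(values)
    for block in partition:
        mean = sum(values[v] for v in block) / len(block)
        for v in block:
            averaged[v] = mean
    return averaged


# Hypothetical finite-precision output in which b/e and c/d differ slightly.
noisy = {"a": 0.168, "b": 0.2441, "e": 0.2439, "c": 0.1719, "d": 0.1721}
partition = [["a"], ["c", "d"], ["b", "e"]]
smoothed = average_rank(noisy, partition)
print(smoothed)
```

After averaging, vertices in the same block hold identical values by construction, which removes the ties' sensitivity to rounding.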
The third and most notable algorithm, QuotientRank, is described in Chapter 5.
This  algorithm  obtains  the  PageRank  vector  by  obtaining  the  dominant  eigenvector  of  the 
quotient  graph  induced  by  the  coarsest  equitable  partition.  The  quotient  graph’s  vertices 
correspond  to  a  partition’s  blocks  and  the  edge  weights  denote  the  number  of  edges  from 
any source block vertex to each destination block's vertices [God93, GoR01, McK04].

For example, the house graph depicted in Figure 4(a) yields the coarsest equitable
partition, [{a},{c,d},{b,e}], which induces the quotient graph illustrated in Figure 4(b).
Since vertex a is connected to vertices b and e, an edge of weight ‘2’ proceeds from block
{a} to block {b,e}. Similarly, vertices b and e are linked, as are vertices c and d, thus, a
loop of weight ‘1’ is attached to blocks {b,e} and {c,d}. Every vertex in block {b,e} is
connected to a vertex in block {c,d}, thus, an edge of weight ‘1’ links these blocks. This
quotient graph yields the same PageRank ordering illustrated in Table 2.
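
The edge weights described above can be computed directly from the partition. The sketch below is illustrative only. The function name is hypothetical, and the house graph's edge list is an assumption inferred from the adjacencies stated in the text.

```python
def quotient_matrix(adj, partition):
    """Entry [i][j] counts the neighbours in block j of any vertex in block i."""
    block_of = {v: i for i, block in enumerate(partition) for v in block}
    k = len(partition)
    quotient = []
    for block in partition:
        rows = []
        for v in block:
            counts = [0] * k
            for u in adj[v]:
                counts[block_of[u]] += 1
            rows.append(counts)
        # In an equitable partition every vertex of a block yields the same counts.
        if any(row != rows[0] for row in rows):
            raise ValueError("partition is not equitable")
        quotient.append(rows[0])
    return quotient


# Assumed adjacency lists for the house graph.
house = {
    "a": ["b", "e"],
    "b": ["a", "c", "e"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["a", "b", "d"],
}
blocks = [["a"], ["c", "d"], ["b", "e"]]
Q = quotient_matrix(house, blocks)
print(Q)
```

With the blocks ordered as {a}, {c,d}, {b,e}, the first row carries the weight-2 edge from {a} to {b,e}, and the diagonal entries of the other two rows are the weight-1 loops. The matrix is not symmetric, which reflects that the quotient graph is directed.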

Boldi  et  al.  showed  a  quotient  graph  can  reduce  the  time  required  to  compute  the 
PageRank  vector,  but  did  not  construct  or  analyze  any  such  method  [BLS+06].  Another 
proof  of  that  result  is  developed  in  Section  5.2.  The  remainder  of  Chapter  5  describes  and 
analyzes  the  QuotientRank  algorithm,  which  uses  a  certain  quotient  matrix  to  reduce  the 
time  needed  to  compute  the  PageRank  vector. 


Figure 4. House Graph and Its Induced Quotient Graph: (a) House Graph, (b) Induced Quotient Graph
1.5. Assumptions

A  common  assumption,  which  is  applied  herein,  is  that  each  input  graph  is  simple. 
That  is,  an  undirected  edge  links  two  distinct  vertices,  and  each  vertex  pair  is  linked  by  at 
most  a  single  edge.  For  example,  the  house  graph  shown  in  Figure  4(a)  is  a  simple  graph. 
Conversely,  the  quotient  graph  induced  by  the  graph’s  coarsest  equitable  partition  is  often 
a  weighted  directed  graph,  as  shown  in  Figure  4(b).  The  simple  graph  assumption  is  only 
applied  for  convenience — the  results  described  herein  can  be  generalized  to  other  classes 
of  graphs,  such  as  directed  graphs  and  weighted  graphs. 

Another conventional assumption used herein is that each input graph is connected.
Thus,  each  vertex  can  be  reached  from  any  other  vertex  by  traversing  one  or  more  edges. 
For  example,  the  house  graph  is  connected,  since  non-adjacent  vertices  can  be  reached  by 
simply traversing two or more edges, e.g., a→b→c or b→c→d. Conversely, the
graph  depicted  in  Figure  5  is  not  connected,  since  it  contains  two  connected  components, 
a  triangle  and  a  square. 
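
The connectivity assumption can be checked with a breadth-first search. The sketch below is illustrative. The function name, the house graph's edge list, and the vertex labels of the triangle-plus-square graph are assumptions, since Figure 5 is not reproduced here.

```python
from collections import deque


def is_connected(adj):
    """Return True if every vertex is reachable from an arbitrary start vertex."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return len(seen) == len(adj)


# Assumed adjacency lists for the house graph.
house = {
    "a": ["b", "e"],
    "b": ["a", "c", "e"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["a", "b", "d"],
}
# A triangle (p, q, r) and a square (s, t, u, w) with hypothetical labels.
triangle_and_square = {
    "p": ["q", "r"], "q": ["p", "r"], "r": ["p", "q"],
    "s": ["t", "w"], "t": ["s", "u"], "u": ["t", "w"], "w": ["u", "s"],
}
print(is_connected(house), is_connected(triangle_and_square))
```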


Figure  5.  Non-Connected  Graph 


If a graph contains n vertices, any permutation applied to the graph must contain n
distinct elements, e.g., $V = [a,b,c,d,e]$ and $\phi = [3,2,1,4,5]$. Finally, any phrase such as
“up to a permutation” or “up to isomorphism” means that the particular graph property is
similarly permuted by a vertex permutation. For example, a PageRank vector is one such
property, as reflected by the house graph isomorphs shown in Figure 2 and Table 2.
1.6. Overview

The  constructs  presented  in  this  chapter,  such  as  the  coarsest  equitable  partition 
and  the  quotient  graph,  are  formally  defined  in  Chapter  2,  along  with  concepts  such  as  the 
orbit partition. Subsequent sections of Chapter 2 describe the PageRank algorithm and its use of the
power  method.  Chapter  2’s  last  section  describes  some  related  results,  such  as  the  work  of 
Boldi  et  al.,  who  independently  established  the  relationship  between  the  graph’s  coarsest 
equitable  partition  and  PageRank  vector  by  constructing  an  alternate  proof  [BLS+06]. 

Chapter  3  begins  by  deriving  a  practical  lower  bound  on  the  execution  time  of  the 
PageRank  algorithm.  The  remainder  is  devoted  to  establishing  a  relationship  between  the 
coarsest  equitable  partition  and  the  PageRank  vector.  This  relationship  yields  two  simple 
methods  described  in  Chapter  4  that  improve  the  PageRank  algorithm’s  performance.  The 
first,  the  AverageRank  algorithm,  eliminates  PageRank  value  differences  among  vertices 
in  the  same  block.  The  second,  the  ProductRank  algorithm,  decreases  the  time  required  to 
obtain  a  PageRank  vector  by  computing  a  single  PageRank  value  for  each  block.  A  more 
dramatic  reduction  is  yielded  by  the  QuotientRank  algorithm  described  in  Chapter  5, 
which  lifts  a  PageRank  vector  from  the  quotient  matrix  and  further  extends  [BLS+06]. 

As  summarized  in  Chapter  6,  these  algorithms  improve  the  PageRank  algorithm’s 
performance  if  a  graph’s  coarsest  equitable  partition  contains  at  least  one  block  composed 
of  multiple  vertices.  One  section  in  Chapter  6  explores  potential  future  research  avenues. 
Some  of  these  avenues  include  generalizing  the  proof  relating  a  graph’s  coarsest  equitable 
partition  and  PageRank  vector,  constructing  a  graph  library  based  on  that  generalization, 
and  developing  functions  that  improve  the  performance  of  additional  linear  algebra  tasks 
using  the  techniques  described  in  Chapters  3-5. 
II. Background


2.1.  Ordering  Nodes  in  Sensor  Networks  and  Unmanned  Vehicle  Swarms 

A  distributed  sensor  network  (DSN)  is  a  heterogeneous  set  of  nodes  for  collecting 

data  in  a  specified  environment.  Many  proposed  DSN  applications,  such  as  an  earthquake 
detection  network,  may  interact  with  infrastructure  systems,  e.g.,  moving  elevators  to  the 
first  floor  [E1E04].  DSNs  currently  monitor  glaciers  [Gui05],  animal  species  [SOP+04], 
and  road  traffic  conditions  [Tra07].  Some  military  applications  include  monitoring  soldier 
health  using  (small!)  ingested  sensors  or  sensors  sewn  into  uniforms  [TBH+04],  locating 
snipers  using  either  fixed  or  helmet-mounted  sensors  [LNV+05,  DBC+98]  and  detecting 
radiation [BMT+04].

An  unmanned  aerial  vehicle  (UAV)  swarm  is  a  heterogeneous  network  of  mobile 
vehicles whose sensors can collect multi-dimensional data [IyB05, ZhG04]. UAV swarms
may collaborate, e.g., by cooperatively searching a geographic area for objects matching
some search criteria [MMP+06, Mor06, PBO03]. Related efforts include automated aerial
refueling [Spi06], attacking targets using munitions deployed from an unmanned combat
aerial vehicle (UCAV) [Cla00, JAU, OSD05, USA05], and decreasing payload weight by
using  more  advanced  sensors  [Rig03]. 

A  natural  problem  to  consider  involves  ordering  nodes  based  on  some  measure  of 
relative importance. An attacker who knows the 1st, 2nd, ..., nth most critical network nodes
knows where to expend the most resources. Some related problems include finding nodes
to  facilitate  spreading  malicious  data  (in  social  networks,  diseases  or  rumors),  computing 
an  optimal  node  polling  order  (traveling  salesman),  adding  network  services  (upgrading), 
or  sequencing  nodes  for  transmission  (logistics  planning). 
As described in Section 1.2, an ideal node ordering is canonical and independent
of the node input order. Alternatively stated, if the graphs of two networks are isomorphs,
the networks ideally yield the same 1st, 2nd, ..., nth canonical node ordering. For example,
a  simple  heuristic  orders  nodes  based  on  their  number  of  neighbors.  The  node  linked  to 
the  greatest  number  of  nodes  is  first,  then  the  node  linked  to  the  second-most  number  of 
nodes,  and  so  forth.  However,  since  every  node  may  have  the  same  number  of  neighbors, 
such  a  node  ordering  is  typically  not  canonical.  Similarly,  sorting  the  nodes  based  on  their 
relative  geographic  positions  also  may  not  define  a  canonical  ordering.  Thus,  more  robust 
methods  of  ordering  nodes  are  typically  needed  to  obtain  a  canonical  node  ordering. 

One  robust  method  of  ordering  a  network’s  nodes  based  on  relative  importance  is 
the  PageRank  algorithm  used  in  some  search  engines  to  order  query  responses,  e.g.,  the 
web pages matching a user's search criteria [PBM+98]. The PageRank algorithm perturbs
the  adjacency  matrix  corresponding  to  the  original  network  such  that  the  resulting  matrix 
specifies  the  probability  of  visiting  each  node  from  any  other  node.  The  perturbed  matrix 
satisfies  the  Perron-Frobenius  theorem’s  conditions.  Therefore,  the  matrix  yields  a  unique 
dominant  eigenvector  whose  entries  correspond  to  the  probability  of  visiting  each  node. 

The  PageRank  algorithm  orders  the  nodes,  e.g.,  the  query  responses,  based  on  this 
eigenvector’s  entries,  which  correspond  to  the  stationary  distribution  of  the  Markov  chain 
defined  by  the  perturbed  adjacency  matrix.  Thus,  the  PageRank  algorithm  finds  a  unique 
vector,  the  PageRank  vector,  where  a  node’s  PageRank  value  also  determines  its  position 
in  the  node  ordering.  This  ordering  corresponds  to  the  order  nodes  are  likely  to  be  visited 
by  an  object  that  randomly  selects  the  next  node  to  visit,  e.g.,  a  user  who  randomly  surfs 
web pages or a message distributed using a rumor-routing protocol.
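
The computation described above can be sketched with the power method on a small undirected graph. This is a minimal illustration, not the thesis's implementation. The scaling value alpha = 0.85, the uniform perturbation, the tolerance, the function name, and the house graph's edge list are all assumptions chosen for the example, and the graph is assumed to have no isolated vertices.

```python
def pagerank(adj, alpha=0.85, tol=1e-12, max_iter=1000):
    """Power-method PageRank for an undirected graph given as adjacency lists."""
    vertices = sorted(adj)
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}  # start from the uniform distribution
    for _ in range(max_iter):
        new_rank = {}
        for v in vertices:
            # One dot product: each neighbour u passes on rank[u] / degree(u).
            walk = sum(rank[u] / len(adj[u]) for u in adj[v])
            new_rank[v] = alpha * walk + (1.0 - alpha) / n
        delta = sum(abs(new_rank[v] - rank[v]) for v in vertices)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Assumed adjacency lists for the house graph.
house = {
    "a": ["b", "e"],
    "b": ["a", "c", "e"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["a", "b", "d"],
}
rank = pagerank(house)
print({v: round(x, 3) for v, x in rank.items()})
```

Under these assumptions the entries sum to one, vertices b and e receive matching values, as do c and d, and the rounded values come out at roughly 0.168, 0.244, and 0.172, in line with Table 2.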
In the context of an unmanned aerial vehicle (UAV) swarm, the PageRank vector
can  be  used  to  determine  where  to  inject  a  message  in  the  swarm  to  ensure  the  message  is 
quickly  transmitted  to  each  node  by  a  rumor-routing  protocol.  For  instance,  the  PageRank 
vector  can  identify  which  group  members  are  likely  to  disseminate  (mis)information  most 
efficiently.  It  similarly  can  identify  good  roadblock  locations  to  capture  fleeing  suspects. 
A  more  sedate  application  determines  intersections  to  avoid  during  rush  hour.  In  general, 
the  PageRank  algorithm  is  applicable  to  any  scenario  that  requires  assessing  the  probable 
behavior  of  an  object  traversing  the  nodes  in  some  network. 

Certain  networks,  however,  yield  a  PageRank  vector  containing  duplicate  entries, 
which  cannot  induce  a  canonical  vertex  ordering.  This  occurs  most  often  in  networks  that 
are not randomly constructed, i.e., networks having some pattern of regularity among their
links  between  nodes.  The  issue  can  be  resolved  in  some  applications,  e.g.,  search  engines, 
by  sorting  on  additional  keys,  such  as  web  page  addresses.  A  second  method,  described  in 
Section  1.2,  is  to  apply  an  application  that  produces  canonical  isomorphs,  e.g.,  nauty,  and 
order the canonical isomorph's nodes using its PageRank vector [McK81, McK04].

Networks  containing  nodes  that  yield  equal  PageRank  values  define  an  interesting 
duality. First, such networks motivate using an application, such as nauty, to find a canonical
PageRank  vector.  Second,  if  a  non-canonical  PageRank  vector  suffices,  which  is  often  the 
case,  such  graphs  also  simultaneously  suggest  methods  for  decreasing  the  time  needed  to 
obtain  the  PageRank  vector.  These  latter  methods  also  eliminate  some  numerical  errors  in 
the  PageRank  vector.  The  three  algorithms  constructed  in  Chapters  4  and  5  leverage  these 
ideas  to  improve  the  PageRank  algorithm’s  performance  if  a  network  contains  nodes  that 
must  have  equal  PageRank  values. 
2.2. Deciding Isomorphism: A Classic Graph-Theoretic Problem

2.2.1.  The  Relationship  to  Canonical  Vertex  Ordering 

The results described herein are obtained by applying tools used in algorithms that
decide graph isomorphism by finding a canonical vertex order, which induces a canonical
isomorph. Deciding graph isomorphism involves deciding if the edge sets, $E_1$ and $E_2$, of
two graphs, $G_1$ and $G_2$, define equivalent links on their respective vertex sets, $V_1$ and $V_2$.
For example, Figure 6 depicts two isomorphs of the triangle graph. Since the triangle is a
complete graph, where each pair of vertices is connected, every off-diagonal entry in the
corresponding adjacency matrices equals ‘1’, as shown in Table 3.

A naive method of deciding graph isomorphism is to compare each permutation of
$G_1$ with $G_2$. For example, a triangle has $n! = 3!$ permutations, where $n = |V|$, and the six
vertex permutations are $\Phi = \{[abc],[acb],[bac],[bca],[cab],[cba]\}$. All permutations of
$G_1$ equal $G_2$, since triangles have one unique isomorph. The number of permutations,
$|\Phi|$, grows factorially with $|V|$, thus, this approach is generally intractable.
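
The naive method described above can be written as a brute-force search over all vertex permutations. The sketch is for illustration only and is practical only for very small graphs. The function name and the example edge lists are assumptions.

```python
from itertools import permutations


def are_isomorphic(vertices1, edges1, vertices2, edges2):
    """Try every bijection from vertices1 to vertices2 (n! candidates)."""
    source = {frozenset(e) for e in edges1}
    target = {frozenset(e) for e in edges2}
    if len(vertices1) != len(vertices2) or len(source) != len(target):
        return False
    for image in permutations(vertices2):
        phi = dict(zip(vertices1, image))
        # With equal edge counts, mapping every edge onto an edge is sufficient.
        if all(frozenset(phi[v] for v in edge) in target for edge in source):
            return True
    return False


# A 4-cycle and a relabelled 4-cycle (assumed edge lists for a square and an hourglass).
square = [(1, 2), (2, 3), (3, 4), (4, 1)]
hourglass = [(1, 2), (1, 3), (3, 4), (2, 4)]
# A path and a star share vertex and edge counts but are not isomorphs.
path = [(1, 2), (2, 3), (3, 4)]
star = [(1, 2), (1, 3), (1, 4)]
print(are_isomorphic([1, 2, 3, 4], square, [1, 2, 3, 4], hourglass))
print(are_isomorphic([1, 2, 3, 4], path, [1, 2, 3, 4], star))
```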


Figure 6. Two Graph Isomorphs: The Triangle and Twisted Rope: (a) $G_1$, The Triangle; (b) $G_2$, Twisted Rope

Table 3. Two Adjacency Matrix Isomorphs: The Triangle and Twisted Rope
(a) $A_1$, The Triangle  (b) $A_2$, Twisted Rope
2.2.2. Graph Isomorphism Applications

Exposure to the graph isomorphism problem, denoted GI, may infect researchers
with the isomorphism “disease” [ReC77, Gat79]. The problem is important, since it is not
known if GI is decidable in deterministic polynomial time on arbitrary graphs. However,
GI is decidable in polynomial time for some graphs, e.g., trees [KrS98]. If GI defines a
new complexity class, it would imply deterministic polynomial time problems, denoted P,
are a proper subset of non-deterministic polynomial time problems, denoted NP. Hence,
if GI defines a new complexity class, it also implies P ≠ NP, a significant result currently
worth $1,000,000 to its discoverer(s) [CMI00].

A  classic  application  of  algorithms  that  decide  GI  is  identifying  chemical  isomers, 
or  compounds  sharing  the  same  formula  but  having  different  atomic  structures  [Fau98]. 
For example, two isomers of C2H2F2 are shown in Figure 7. Each isomer contains two atoms each of
carbon (C), fluorine (F), and hydrogen (H), and the edges denote chemical bonds [NIS].
Isomers  are  often  stored  in  some  canonical  representation  to  simplify  their  comparison. 

Another  application  finds  a  subgraph  within  some  larger  graph,  e.g.,  finding  small 
circuits  in  large  circuits  [OEG+93].  Algorithms  capable  of  deciding  isomorphism  can  be 
used  in  optical  character  recognition  (OCR)  [WaG04],  to  compare  files  [Car03,  BeC06], 
or  to  analyze  social  networks  patterns,  such  as  enemy  communication  routes  [GCM06].  A 
novel  application  involves  guiding  a  UAV  to  replace  sensor  network  nodes  [CHP+04]. 


Figure 7. Two Chemical Isomers of C2H2F2: (a) 1,1-Difluoroethene, (b) 1,2-Difluoroethene
2.2.3. A Formal Definition


A pair of graphs, $G_1$ and $G_2$, are isomorphs if and only if a permutation, $\phi$, exists
satisfying (1), where for each edge in $E_1$, an equivalent edge exists in $E_2$. For example,
applying the permutation, $\phi = [2,1,3,4]$, to the square's vertices illustrated in Figure 8(a)
confirms the square is an isomorph of the hourglass shown in Figure 8(b). In other words,
each graph contains four vertices, where each vertex is adjacent to two other vertices,
such that the edges define a cycle graph of length four, denoted $C_4$.

$$G_1 \cong G_2 \iff \exists\, \phi(V_1) = V_2 \ \text{s.t.}\ \forall (v_i, v_j) \in E_1:\ \phi(v_i) \in V_2 \,\wedge\, \phi(v_j) \in V_2 \,\wedge\, (\phi(v_i), \phi(v_j)) \in E_2 \qquad (1)$$

Figure 8. Two Isomorphs: The Square and Hourglass: (a) $G_1$, The Square; (b) $G_2$, The Hourglass


Equivalently, two adjacency matrices, $A_1$ and $A_2$, are isomorphs if and only if a
permutation matrix, $P$, exists satisfying (2), where $P^T$ denotes the matrix transpose, and
‘$\cdot$’ denotes matrix multiplication. A permutation matrix, $P$, is a row permutation, $\phi$, of the
identity matrix, $I$, a square matrix whose diagonal entries equal one and whose other entries equal
zero. Thus, the row permutation orders $I$'s rows based on the given permutation vector, $\phi$.
The transpose of the permutation matrix, $P$, denoted $P^T$, is equivalent to applying $\phi$ as a
column permutation to $I$, i.e., ordering $I$'s columns with respect to $\phi$.

$$A_1 \cong A_2 \iff \exists\, P \ \text{s.t.}\ A_2 = P \cdot A_1 \cdot P^T \qquad (2)$$
For example, the adjacency matrices of the square and hourglass graphs depicted
in Figure 8 are listed in Table 4, where $A_1 \neq A_2$. The identity matrix, denoted $I^4$, is
listed in Table 5(a). The permutation matrix, $P$, and its associated transpose, $P^T$, listed in
Tables 5(b) and (c), are obtained by permuting rows and columns of $I^4$ with respect to
the permutation, $\phi = [2,1,3,4]$. Multiplying $A_1$ on the left by $P$ yields the matrix listed in Table 6(a),
and multiplying the result on the right by $P^T$ yields the matrix listed in Table 6(b). Since the matrices
shown in Tables 4(b) and 6(b) are equal, i.e., since $A_2 = P \cdot A_1 \cdot P^T$, $A_1 \cong A_2$.
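
The product in (2) can be worked through in plain Python. The tables are not reproduced here, so the square's labeling (the cycle 1-2-3-4-1) is an assumption, and the helper names are hypothetical. The sketch builds P by reordering the identity matrix's rows and then forms P · A1 · P^T.

```python
def permutation_matrix(phi):
    """Row i of the result is row phi[i] of the identity matrix (phi is 1-based)."""
    n = len(phi)
    return [[1 if col == p - 1 else 0 for col in range(n)] for p in phi]


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Assumed adjacency matrix of the square, i.e. the cycle 1-2-3-4-1.
A1 = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
phi = [2, 1, 3, 4]
P = permutation_matrix(phi)
PA1 = matmul(P, A1)              # rows of A1 reordered by phi
A2 = matmul(PA1, transpose(P))   # columns reordered by phi as well
for row in A2:
    print(row)
```

Under the assumed labeling, the result has edges 1-2, 1-3, 2-4, and 3-4, which is still a 4-cycle; only the vertex labels have moved.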

Table 4. Two Isomorphs: The Adjacency Matrices of the Square and Hourglass
(a) $A_1$, The Square  (b) $A_2$, The Hourglass

Table 5. Identity Matrix and Permutation Matrices for $\phi = [2,1,3,4]$
(a) $I^4$, Identity Matrix  (b) Rows, $P = I^4_{[2,1,3,4],:}$  (c) Columns, $P^T = I^4_{:,[2,1,3,4]}$

Table 6. Establishing $A_2 = P \cdot A_1 \cdot P^T$ and $A_1 \cong A_2$
(a) $P \cdot A_1$  (b) $A_2 = P \cdot A_1 \cdot P^T$


A permutation, $\phi$, has a unique inverse, denoted $\phi^{-1}$, where $\phi^{-1}(i) = j$ if and only if
$\phi(j) = i$, i.e., $\phi^{-1}(i)$ equals the position, $j$, of element $i$ in $\phi$. For example, for $\phi = [3,4,2,1,5]$,
‘1’ is in position four, thus, $\phi^{-1}(1) = 4$ and $\phi^{-1} = [4,3,1,2,5]$. The composition of $\phi$ and
$\phi^{-1}$ is commutative and yields the identity permutation, hence, $\phi(\phi^{-1}) = \phi^{-1}(\phi) = [1,2,\ldots,n]$.
One method of obtaining $\phi^{-1}$ is to augment $\phi$ with the identity permutation, $[1,2,\ldots,n]$,
and sort entries in $\phi$, yielding $\phi^{-1}$, as shown in Table 7, in $O(n \cdot \log n)$ time [Knu97].

Table  7.  Computing  the  Inverse  Permutation 
(a)  M  =  (b)  T  =  lex_sort_columns(M)  (c)  =  T  j 
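
A minimal sketch of the sort-based inversion follows. The function name is illustrative, and pairing each entry with its position stands in for the two-row matrix $M$ of Table 7.

```python
def inverse_permutation(phi):
    """Pair each entry of phi with its 1-based position, sort the pairs by
    entry, and read off the positions; the sort dominates at O(n log n)."""
    pairs = sorted(zip(phi, range(1, len(phi) + 1)))
    return [position for _, position in pairs]

phi = [3, 4, 2, 1, 5]
phi_inv = inverse_permutation(phi)            # [4, 3, 1, 2, 5]
composed = [phi[i - 1] for i in phi_inv]      # [1, 2, 3, 4, 5]
```

Composing in the opposite order, `[phi_inv[i - 1] for i in phi]`, also returns the identity permutation.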

Similarly, $P$ and $P^T$ are pair-wise inverses: finding $P^{-1}$ and $P^T$ is equivalent to
swapping $\phi$’s entries and positions to obtain $\phi^{-1}$. Thus, permuting the rows of $I_n$ by $\phi$ is
equivalent to permuting columns by $\phi^{-1}$ and vice versa, i.e., $I_{\phi,:} = I_{:,\phi^{-1}}$ and $I_{:,\phi} = I_{\phi^{-1},:}$.

For example, in Table 8, the row-permuted matrix, $P = I_{\phi,:}$, equals the column-permuted
matrix, $I_{:,\phi^{-1}}$. Thus, $\phi^{-1}$ can be determined by enumerating the column or row heading of
each ‘1’ contained in $P$ or $P^T$. Finally, since $P^{-1} = P^T$, if $A_1 \cong A_2$, then
$A_2 = P \cdot A_1 \cdot P^T \iff P^T \cdot A_2 \cdot P = A_1$.


Table  8.  Computing  Inverse  Permutation  Matrices 
2.2.4. Canonical Isomorphs

A useful method of deciding graph isomorphism involves computing a canonical
isomorph of each graph, since isomorphs yield identical canonical isomorphs [McK81].
For example, one method of assessing if the words, logarithm and algorithm, are relative
permutations is to search for letters of logarithm in algorithm by searching for an ‘l’ in
algorithm, then an ‘o’, a ‘g’, and so forth, in $O(n^2)$ time. A faster method compares the
sorted letters in each word in $O(n \cdot \log n)$ time. In fact, logarithm and algorithm yield the
same canonical isomorph, aghilmort.
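
The word analogy can be checked directly. The sketch below uses a hypothetical helper, `canonical_word`, which sorts the letters of a word to obtain its canonical form.

```python
def canonical_word(word):
    """Canonical form of a word: its letters in sorted order."""
    return "".join(sorted(word))

same = canonical_word("logarithm") == canonical_word("algorithm")
```

Two words are relative permutations exactly when their canonical forms match, which mirrors comparing the canonical isomorphs of two graphs.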

Similarly, concatenating entries by column in the upper-right triangle of a graph’s
adjacency matrix, $A$, yields a binary value, denoted num($A$) [KrS98]. For example, the
adjacency matrices listed in Tables 9(a) and (b) are from distinct house graph isomorphs.
The shaded upper-right triangles of these matrices therefore yield distinct values,
num($A_1$) $= 1010011101_2 = 669_{10}$ and num($A_2$) $= 1100110101_2 = 821_{10}$, respectively. The
minimum canonical isomorph (MCI), denoted $A_C$, is the isomorph yielding the smallest
binary number with respect to all isomorphs. The house graph’s MCI, given its 60 unique
isomorphs, is listed in Table 9(c), where num($A_C$) $= 0011011101_2 = 221_{10}$.


Table 9. Three Isomorphs of the House Graph’s Adjacency Matrix
(a) $A_1$  (b) $A_2$  (c) $A_C$, MCI
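
For a graph this small, the MCI can be found by exhaustive search. The sketch below is a brute-force illustration, not the pruned search discussed later; it assumes the house graph's adjacency matrix with vertices ordered a, b, c, d, e (edges a-b, a-e, b-c, b-e, c-d, d-e), and the helper names are illustrative.

```python
from itertools import permutations

# Assumed vertex order a, b, c, d, e for the house graph.
HOUSE = [[0, 1, 0, 0, 1],
         [1, 0, 1, 0, 1],
         [0, 1, 0, 1, 0],
         [0, 0, 1, 0, 1],
         [1, 1, 0, 1, 0]]

def num(A):
    """Concatenate the upper-right triangle column by column into an integer."""
    n = len(A)
    bits = "".join(str(A[i][j]) for j in range(1, n) for i in range(j))
    return int(bits, 2)

def relabel(A, order):
    """Isomorph of A whose vertex i is the original vertex order[i]."""
    return [[A[u][v] for v in order] for u in order]

def minimum_canonical_isomorph(A):
    """Try all n! vertex orders; feasible only for very small graphs."""
    return min((relabel(A, p) for p in permutations(range(len(A)))), key=num)

mci = minimum_canonical_isomorph(HOUSE)
distinct_values = {num(relabel(HOUSE, p)) for p in permutations(range(5))}
```

With this vertex order, `num(HOUSE)` is 669, `num(mci)` is 221, and the 120 vertex orders produce 60 distinct values, in agreement with the figures quoted above.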
The canonical isomorph, $A_C$, of two graphs can be obtained separately, and then
compared in $O(n^2)$ time, e.g., in a chemical isomer database. If the graphs are large, a
hash value of the canonical isomorphs, e.g., MD5, can be used to determine if two graphs
could be isomorphs [GHK+03]. Synonyms for the graph’s canonical isomorph include its
certificate, lexicographic leader, or signature [Rea72, BaL83].

Formally, two isomorphs, $A_1$ and $A_2$, and their respective permutations,
$P_1$ and $P_2$, to some canonical isomorph, $A_C$, share the relationships illustrated by
the permutation triangle shown in Figure 9. The solid lines denote explicit relationships,
$A_1 \cong A_C$ and $A_2 \cong A_C$, yielded by $P_1$ and $P_2$, respectively. The dashed line
denotes the implicit relationship, $A_1 \cong A_2$, yielded by $A_1 \cong A_C \cong A_2$.

Other canonical isomorphs exist, such as the maximum canonical isomorph,
but the canonical isomorph of choice is the MCI, if only for standardization [KrS98]. The
MCI has many other useful properties. For example, the graph’s MCI also identifies the
graph’s maximum independent set (MIS), or equivalently, the maximum clique (MC) of
the graph’s complement [BaL83, KrS98].




Figure  9.  Canonical  Isomorph’s  Permutation  Triangle 
2.3. Vertex Partitions

A  common  method  for  finding  a  graph’s  canonical  isomorph  is  based  on  pruning  a 
search  tree  using  the  process  described  in  Section  2.3.4.  However,  it  is  first  necessary  to 
define  tools  used  in  that  search  process,  namely,  certain  specific  disjoint  vertex  partitions. 

Vertex  partitions  can  be  constructed  in  many  ways,  where  such  partitions  are  used 
to  prune  the  solution  space  of  various  problems,  e.g.,  determining  a  canonical  isomorph. 
For  instance,  a  readily  obtained  vertex  partition  is  the  degree  partition,  in  which  vertices 
are  grouped  based  on  their  number  of  immediate  neighbors.  For  some  graphs,  the  degree 
partition  is  an  equitable  partition,  where  vertices  contained  in  each  block  have  the  same 
number  of  neighbors  with  respect  to  each  of  the  partition’s  blocks.  Although  the  degree 
partition  is  often  not  equitable,  it  is  often  used  as  the  seed  partition  to  initialize  the  search 
for  more  refined  vertex  partitions,  such  as  the  coarsest  equitable  partition. 

If  vertices  are  initially  placed  in  the  same  block,  the  coarsest  equitable  partition  is 
the  most  refined  partition  that  can  be  obtained  if  the  neighbors  of  every  vertex  are  known. 
The  coarsest  equitable  partition  of  some  graphs  corresponds  to  their  orbit  partition,  which 
can  be  obtained  using  a  pruning  process  similar  to  that  used  to  find  canonical  isomorphs. 
However,  the  coarsest  equitable  partition  can  be  computed  in  deterministic  polynomial 
time  whereas  finding  the  orbit  partition  may  require  exponential  time. 

Significant  attention  is  devoted  to  how  a  coarsest  equitable  partition  is  obtained  to 
facilitate  relating  that  partition  to  the  dot  product  and  PageRank  vector.  The  relationships 
are  established  in  Chapter  3  by  relating  the  process  of  multiplying  entries  in  a  matrix  row 
to  sorting  a  row’s  entries.  The  coarsest  equitable  partition  and  its  induced  quotient  graph 
are  leveraged  in  Chapters  4  and  5  to  improve  the  PageRank  algorithm’s  performance. 
Given an arbitrary graph, $G = (V,E)$, a partition, $B$, is a set of blocks containing
vertices in $V$ such that the union of the blocks equals $V$. Thus, if $V = \{v_1, v_2, \ldots, v_n\}$, then
$B = \{b_1, b_2, \ldots, b_k\}$, where $b_i \subseteq V$ and $\bigcup_{i=1}^{k} b_i = V$. A partition is proper if block pairs are
disjoint, i.e., $b_i \cap b_j = \emptyset$, $i \neq j$, where a disjoint union is denoted $V = b_1 \mathbin{\dot{\cup}} b_2 \mathbin{\dot{\cup}} \cdots \mathbin{\dot{\cup}} b_k$.
A unit partition contains one block that equals $V$, where $B = \{V\}$. Given a partition, $B_i$,
a refinement is an operation yielding some partition, $B_{i+1}$, where each block contained in
$B_{i+1}$ is a subset of some block in $B_i$. For example, if $B_i$ = {{a,b},{c,d}}, then
$B_{i+1}$ = {{a,b},{c},{d}} is a refinement of $B_i$. However, $B_{i+1}$ = {{a,b,c},{d}} cannot be
a refinement of $B_i$, since block {a,b,c} is not a subset of block {a,b} or block {c,d}.
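
These definitions translate directly into set operations. The following is a minimal sketch with illustrative function names; blocks are represented as Python sets, and the additional requirement that blocks be non-empty is an assumption of the sketch.

```python
def is_proper_partition(blocks, vertices):
    """True if the blocks are non-empty, pairwise disjoint, and their union is `vertices`."""
    seen = [v for block in blocks for v in block]
    return all(blocks) and len(seen) == len(set(seen)) and set(seen) == set(vertices)

def is_refinement(finer, coarser):
    """True if every block of `finer` is a subset of some block of `coarser`."""
    return all(any(set(f) <= set(c) for c in coarser) for f in finer)

B_i = [set("ab"), set("cd")]
ok = is_refinement([set("ab"), set("c"), set("d")], B_i)     # True
bad = is_refinement([set("abc"), set("d")], B_i)             # False
```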

An ordered partition, denoted $B = [b_1, b_2, \ldots, b_k]$, induces a block order such that
the relative order of each block’s vertices is maintained in any future partition refinement.
For example, if $B_i$ = [{a,b},{c,d}], then $B_{i+1}$ = [{a,b},{d},{c}] maintains the relative block
order with respect to $B_i$. However, $B_{i+1}$ = [{d},{c},{a,b}] does not maintain the
block order with respect to $B_i$, since block {a,b} is located before block {c,d} in
$B_i$. Each partition is assumed to be proper, i.e., disjoint, and ordered.

A discrete partition is maximally refined, where every block contains one vertex,
e.g., $B_i$ = [{d},{c},{a},{b}]. A discrete partition induces a vertex permutation that also
induces an isomorph of the input graph. Finding canonical discrete partitions is a key goal in
many applications that produce canonical graph isomorphs, e.g., nauty [McK81].
The maximal set of ordered vertex partitions yields a partition tree. The root node
is the coarsest possible partition, the unit partition, where each descendant is a refinement
of its ancestor partition. Partitions along a path in this search tree retain vertices in their
block-relative order. Every leaf node is an ordered discrete partition, where the number of
leaf partitions corresponds to the number of vertex permutations, $|V|!$.

For example, the unit partition of a 3-vertex graph is [{a,b,c}], the root node of
the corresponding partition tree shown in Figure 10. The root node’s descendants are the
smallest possible refinements of the graph’s unit partition, where the blocks are ordered
by the number of vertices contained in each block. The leaves are the discrete partitions
corresponding to the six possible vertex permutations of a 3-vertex graph. For instance,
[{a},{b},{c}] and [{c},{b},{a}] correspond to the permutations, [1,2,3] and [3,2,1],
respectively. The performance of nauty, and of most methods that find a canonical
isomorph, is obtained by dramatically pruning the number of nodes in the vertex partition
tree using information yielded by the graph’s edges.
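
The leaf count can be illustrated by enumerating the ordered discrete partitions of a small vertex set. The sketch below lists only the leaves, not the interior nodes of the tree, and the function name is illustrative.

```python
from itertools import permutations
from math import factorial

def discrete_partitions(vertices):
    """All ordered discrete partitions: one singleton block per vertex, in every order."""
    return [[{v} for v in order] for order in permutations(vertices)]

leaves = discrete_partitions("abc")
leaf_count = len(leaves)          # 6 == 3!
```

The factorial growth of `leaf_count` is the reason unpruned search is impractical beyond very small graphs.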


Figure  10.  Vertex  Partition  Tree  of  an  Arbitrary  3 -Vertex  Graph 
2.3.1. The Degree Partition

Many methods exist for pruning an arbitrary graph’s partition tree. One
method is to group vertices by their degrees, or number of neighbors, denoted $\deg(v_i)$,
where $v_i \in V$, $e_j = \{v_i, v_k\} \in E$, and $\deg(v_i) = |\{e_j \in E : v_i \in e_j\}|$, $1 \leq j \leq |E|$. The degree sequence is an
example of a graph invariant, i.e., isomorphs have the same degree sequence. Similarly,
isomorphs contain an equal number of vertices and edges, i.e., $|V_1| = |V_2|$ and $|E_1| = |E_2|$.

Thus, one method of checking if two graphs may be isomorphs is to compare the
number of vertices and edges each graph contains and their degree sequences. Formally,
an invariant is a property, $\psi$, isomorphs must share, for instance, $|V_1| = |V_2|$ and $|E_1| = |E_2|$
if $G_1 \cong G_2$. Conversely, a complete, or sufficient, invariant is an invariant that completely
decides graph isomorphism, such as a canonical isomorph.

The degree sequence is often used to determine two graphs cannot be isomorphs.
For example, the house graph and the complete graph, $K_5$, shown in Figure 11 cannot be
isomorphs, since they yield two different degree sequences, [2,2,2,3,3] and [4,4,4,4,4],
respectively. The house graph and $K_5$ similarly have distinct degree partitions, which are
[{a,c,d},{b,e}] and [{a,b,c,d,e}], respectively.


(a) $G_1$, House Graph  (b) $G_2 = K_5$

Figure 11. Two Graphs Yielding Different Degree Sequences
However, certain non-isomorphic graphs yield identical sorted degree sequences.
For example, the house graph and the complete bipartite graph, $K_{2,3}$, shown in Figure 12
yield the same sorted degree sequence, [2,2,2,3,3], which yields the degree partitions,
[{a,c,d},{b,e}] and [{c,d,e},{a,b}], respectively. However, simple visual inspection
reveals these two graphs are not isomorphs. Two non-isomorphic graphs having the same
value with respect to an arbitrary invariant are said to be a devil’s pair [Ros00]. Thus, the
house graph and $K_{2,3}$ are a devil’s pair with respect to the sorted degree sequence.
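
The devil's pair can be checked numerically. In the sketch below, each graph is an adjacency dictionary whose vertex labels are assumed from the degree partitions quoted above, and the triangle count is introduced here only as a second, illustrative invariant that happens to separate the two graphs.

```python
from itertools import combinations

# Assumed labelings: house edges a-b, a-e, b-c, b-e, c-d, d-e; K_{2,3} joins {a,b} to {c,d,e}.
HOUSE = {"a": "be", "b": "ace", "c": "bd", "d": "ce", "e": "abd"}
K23 = {"a": "cde", "b": "cde", "c": "ab", "d": "ab", "e": "ab"}

def sorted_degree_sequence(adj):
    return sorted(len(nbrs) for nbrs in adj.values())

def degree_partition(adj):
    """Blocks of equal-degree vertices, ordered by ascending degree."""
    blocks = {}
    for v, nbrs in adj.items():
        blocks.setdefault(len(nbrs), set()).add(v)
    return [blocks[d] for d in sorted(blocks)]

def triangle_count(adj):
    """Number of 3-cliques; equal for isomorphic graphs."""
    return sum(1 for u, v, w in combinations(sorted(adj), 3)
               if v in adj[u] and w in adj[u] and w in adj[v])
```

Both graphs return [2, 2, 2, 3, 3] from `sorted_degree_sequence`, but the house graph contains one triangle whereas the bipartite $K_{2,3}$ contains none, so they cannot be isomorphs.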

Degree sequences have a more elementary shortcoming. Ideally, a graph invariant
uniquely labels each vertex and induces a canonical vertex ordering. However, if $|V| \geq 2$,
it is impossible to uniquely label the vertices using the degree sequence, since at least two
vertices have equal degrees. First, a connected graph on $n \geq 2$ vertices has a degree range
of $[1, n-1]$. By the pigeonhole principle, at least two vertices must have the same degree,
since the range only contains $n-1$ distinct values, whereas the graph contains $n$ vertices.
Therefore, at least one block in the degree partition contains two or more vertices, which
precludes obtaining a discrete partition that induces a canonical vertex ordering. A similar
approach generalizes this proof to disconnected graphs.


(a) $G_1$, House Graph  (b) $G_2 = K_{2,3}$

Figure 12. Devil’s Pair for the Sorted Degree Sequence
2.3.2. The Equitable Partitions

An equitable partition is an extension of the degree partition, since every vertex
contained in the same block of an equitable partition yields the same degree. However, an
equitable partition adds the criterion that vertices in the same block have an equal number
of neighbors with respect to any block in the partition, including their own block. Thus,
an equitable partition extends the concept of vertex degrees to block degrees, where given
an equitable partition, $B$, and an arbitrary pair of blocks, $b_i, b_j \in B$, each vertex contained
in $b_i$ has an equal number of neighbors contained in $b_j$, and vice versa.

Given an arbitrary graph, $G = (V,E)$, a partition, $B$, is determined to be equitable
by applying the vertex neighborhood, denoted $N(v)$, where $v \in V$ and $|N(v)| = \deg(v)$.
The neighborhood of an arbitrary vertex, $v$, is its set of adjacent vertices, where [KrS98]

$N(v) = \{u \in V : \{u,v\} \in E\}$.  (3)

Therefore, an arbitrary partition, $B = [b_1, b_2, \ldots, b_k]$, $1 \leq k \leq n$, where $n = |V|$, is equitable
with respect to $G$, if for all $i$ and $j$, $1 \leq i,j \leq k$, and for all $u,v \in b_i$,

$|N(u) \cap b_j| = |N(v) \cap b_j|$.  (4)

Given two vertices, $u$ and $v$, contained in an arbitrary block, $b_i$, $u$ and $v$ have the
same number of neighbors in any block, $b_j$, including the case, $i = j$. However, vertices
$u$ and $v$ do not have to share the same neighbors in each block, only the same number of
neighbors. The number of neighbors may differ across blocks, i.e., it is not required that
$|N(u) \cap b_j| = |N(u) \cap b_k|$, $j \neq k$, or $|N(v) \cap b_j| = |N(v) \cap b_k|$, $j \neq k$. Thus, it is acceptable
for a vertex to have a different number of neighbors in each block.


For example, the house graph shown in Figure 13(a) yields the ordered partition,
[{a,c,d},{b,e}]. This partition is not equitable, since vertices c and d are each adjacent
to one vertex in block {b,e}, whereas vertex a is a neighbor of both vertices contained in
block {b,e}. A second reason this partition is not equitable is that block {a,c,d} contains
two adjacent vertices, c and d, whereas vertex a is not adjacent to any vertex contained in
block {a,c,d}. Thus, the vertices contained in the block, {a,c,d}, do not share an equal
number of neighbors with respect to any of the blocks contained in this partition.

Conversely, the graph, $K_{2,3}$, depicted in Figure 13(b) yields the degree partition,
[{c,d,e},{a,b}]. This partition is equitable, since the vertices in each block are adjacent
to all vertices in the other block, but none of the vertices in their own block. The graph,
$K_5$, shown in Figure 13(c) similarly yields the equitable degree partition, [{a,b,c,d,e}].

The graph, $K_5$, is 4-regular, where vertices in a $k$-regular graph have $k$ neighbors.
Thus, the degree partition of an arbitrary $k$-regular graph is also equitable. Moreover, the
degree partition of an arbitrary $k$-regular graph equals its unit partition. However, the unit
and degree partition of most graphs are not equitable, e.g., the house graph.
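
Definition (4) can be tested mechanically. The sketch below is one possible check, with adjacency dictionaries whose labels are assumed from the partitions quoted in the text; `is_equitable` is an illustrative name.

```python
# Assumed labelings consistent with the partitions quoted in the text.
HOUSE = {"a": "be", "b": "ace", "c": "bd", "d": "ce", "e": "abd"}
K23 = {"a": "cde", "b": "cde", "c": "ab", "d": "ab", "e": "ab"}
K5 = {v: "abcde".replace(v, "") for v in "abcde"}

def is_equitable(adj, partition):
    """Definition (4): vertices sharing a block have the same number of
    neighbors in every block of the partition, including their own."""
    for block in partition:
        profiles = {tuple(len(set(adj[v]) & set(other)) for other in partition)
                    for v in block}
        if len(profiles) > 1:
            return False
    return True

house_degree = is_equitable(HOUSE, ["acd", "be"])     # False
k23_degree = is_equitable(K23, ["cde", "ab"])         # True
k5_unit = is_equitable(K5, ["abcde"])                 # True
```

Each vertex is summarized by its tuple of per-block neighbor counts, so a block is consistent exactly when all of its vertices share one tuple.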


(a) $G_1$, House Graph  (b) $G_2 = K_{2,3}$  (c) $G_3 = K_5$

Figure 13. Exploring Equitable Partitions
2.3.3. The Coarsest Equitable Partition

Fortunately, given an arbitrary graph and an arbitrary initial partition, a specific
equitable partition is always guaranteed to exist. The coarsest equitable partition is the
most  refined  partition  that  can  be  obtained  using  only  the  information  that  is  derived  from 
the  neighbors  of  each  vertex.  The  coarsest  equitable  partition  is  a  key  tool  used  to  prune 
the  partition  tree  while  finding  a  canonical  isomorph.  For  instance,  on  nearly  all  random 
graphs,  the  coarsest  equitable  partition  is  a  discrete  partition.  Thus,  the  coarsest  equitable 
partition  of  most  random  graphs  prunes  the  partition  tree  to  a  single  leaf,  which  induces  a 
canonical  vertex  ordering.  The  results  described  in  Chapters  3-5  establish  the  PageRank 
vector  can  be  obtained  more  efficiently  if  the  coarsest  equitable  partition  is  non-discrete, 
i.e.,  contains  blocks  composed  of  multiple  vertices. 

For example, the mansion graph shown in Figure 14(a) yields the discrete coarsest
equitable partition, [{b},{f},{c},{e},{d},{a}], and induces the canonical vertex order
illustrated in Figure 14(b). Conversely, the house graph shown in Figure 14(c) yields the
coarsest equitable partition, [{c,d},{a},{b,e}], which cannot induce a canonical order.
The results described in Chapters 4 and 5 improve the PageRank algorithm’s performance
on graphs whose coarsest equitable partition is non-discrete, such as the house graph.


(a) $G_1$, Mansion Graph  (b) Canonical Ordering of $G_1$  (c) $G_2$, House Graph
Figure  14.  Coarsest  Equitable  Partitions  and  Canonical  Orderings 
2.3.3.1. A Formal Method

The method described in this section for finding the coarsest equitable partition is
derived from the definition of an equitable partition (4) and restated here for convenience.
Given some partition, $B = [b_1, b_2, \ldots, b_k]$, $1 \leq k \leq n$, a partition is equitable with respect to
an arbitrary graph, $G = (V,E)$, if for all $i$ and $j$, $1 \leq i,j \leq k$, and for all $u,v \in b_i$, [KrS98]

$|N(u) \cap b_j| = |N(v) \cap b_j|$,  (5)

where $n = |V|$ and $N(u)$ denotes a set containing the neighbors of vertex $u$.

A simple method of satisfying this definition and computing the coarsest equitable
partition is to proceed as follows. Given an arbitrary partition, such as the unit partition,
select an arbitrary block in the partition containing two or more vertices. Then, select the
first vertex in the block and identify the number of neighbors of that vertex relative to all
blocks in the partition. Repeat this process for all vertices contained in the block. Vertices
yielding the same number of neighbors with respect to every block, i.e., identical sorted
block degree sequences, remain together, whereas vertices yielding different sequences are
split into separate blocks. If the selected block cannot be split,
repeat the process for all blocks containing multiple vertices until some block splits. A
split block is ordered with respect to the vertex degree and block size in either ascending
or descending order. If none of the blocks split after some iteration of this process, the
vertex partition has stabilized to the coarsest equitable partition (5).

This naive partition refinement algorithm’s execution time has an upper bound of
$O(n^3)$, where $n = |V|$ [KaS83, PaT87]. If a graph is stored as a set of adjacency lists, the
algorithm’s upper bound is $O(m \cdot n)$, where $m = |E|$. This algorithm’s key contribution is
as an easily described method of finding the graph’s coarsest equitable partition.
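
The naive procedure can be sketched as follows. This variant is an assumption-laden simplification: it attempts to split every block in each pass rather than one block at a time, and it orders the resulting sub-blocks by ascending neighbor-count tuple, which is one possible reading of the ordering rule above. The function name and the adjacency-dictionary input are illustrative.

```python
HOUSE = {"a": "be", "b": "ace", "c": "bd", "d": "ce", "e": "abd"}

def refine_naive(adj, partition=None):
    """Split blocks by per-block neighbor counts until no block splits."""
    if partition is None:
        partition = [sorted(adj)]                     # unit partition
    partition = [list(block) for block in partition]
    while True:
        refined = []
        for block in partition:
            groups = {}
            for v in block:
                profile = tuple(sum(1 for u in adj[v] if u in other)
                                for other in partition)
                groups.setdefault(profile, []).append(v)
            # Assumed ordering rule: ascending neighbor-count tuple.
            refined.extend(groups[p] for p in sorted(groups))
        if len(refined) == len(partition):            # stable: nothing split
            return refined
        partition = refined

house_cep = refine_naive(HOUSE)    # [['a'], ['c', 'd'], ['b', 'e']]
```

On the house graph the first pass yields the degree partition, the second pass separates vertex a from c and d, and the third pass confirms stability.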
For example, the unit partition yielded by the house graph shown in Figure 15(a)
is [{a,b,c,d,e}]. Thus, the first block containing more than one vertex is the unit block,
{a,b,c,d,e}. Three vertices, {a,c,d}, are adjacent to two vertices in block {a,b,c,d,e},
whereas two vertices, {b,e}, are adjacent to three vertices in block {a,b,c,d,e}. Thus, the
refined partition yielded by this step, after ordering by the vertex degree and the number
of elements contained in the resulting blocks, is [{a,c,d},{b,e}]. The degree partition is
illustrated using unique shapes in the graph shown in Figure 15(b).

Another iteration must be performed to assess if the current partition is stable. The
first block, {a,c,d}, contains multiple vertices where vertex a has zero neighbors in
block {a,c,d} and two neighbors in block {b,e}. Conversely, vertices c and d each have
one neighbor in block {a,c,d}, and one neighbor in block {b,e}. Thus, a new partition is
obtained, [{a},{c,d},{b,e}], as shown using unique shapes in Figure 15(c). Performing
a third iteration confirms the partition is stable, i.e., no more block refinement will occur.
Hence, the house graph’s coarsest equitable partition is [{a},{c,d},{b,e}].


(a)  House  Graph  (b)  Degree  Partition  (c)  Coarsest  Equitable  Partition 

Figure  15.  Method  1:  Finding  the  House  Graph’s  Coarsest  Equitable  Partition 
2.3.3.2. A Fast Method

More efficient methods exist for computing the coarsest equitable partition, e.g.,
the algorithm described by Cardon and Crochemore [CaC82]. Paige and Tarjan suggested
a simpler algorithm [PaT87] described more extensively by Kreher and Stinson [KrS98].
The method’s upper bound is $O(n^2 \cdot \log n)$ and reduces to $O((m+n) \cdot \log n)$ if a graph is
stored in sparse form, where $n = |V|$ and $m = |E|$ [BLS+06]. The key data structures are:

•  the  current  vertex  partition,  B, 

•  a  test  vertex  set,  U, 

•  a  test  block  set,  S,  and 

•  a  block  set,  Z, 

where $Z$ is ordered on the number of vertices in a test block, $s \in S$, that are neighbors of
vertices contained in blocks of the partition, $B$, where $s \subseteq U$. The partition, $B$, is said to
be equitable after the potential test block set, $S$, has been emptied.

The algorithm listed in Figure 16 is a commented variant of Kreher and Stinson’s
method [KrS98]. The unit partitions for $B$ and $S$ are created on lines 2-4. The main loop
is entered on line 5 and iterates until the test block set, $S$, is empty. A test block, $s \in S$, is
selected on line 7 and its vertices are checked for validity on line 9 by assessing if $s \subseteq U$.
If all vertices in $s$ are valid, they are compared to each non-discrete block, $b_i \in B$, $|b_i| > 1$.

The number of vertices in $s$ adjacent to the vertices in block $b_i$ are stored in the
block set, $Z$, as shown on lines 14-21. If $Z$ contains two or more disjoint sub-blocks of
block $b_i$, i.e., if $|\{z_j \in Z : z_j \neq \emptyset\}| > 1$, block $b_i$ is subsequently replaced by those
sub-blocks, $z_{j=0,1,\ldots,n-1} \in Z : z_j \neq \emptyset$, on lines 24 and 25. The sets of potential test blocks, $S$,
and valid test vertices, $U$, are updated on lines 7-10 and lines 26-29.


 1.  findCoarsestPartition(G, n)
 2.    # create initial unit partitions (all vertices start as valid test vertices)
 3.    B ← [{1,2,...,n}];  U ← {1,2,...,n}
 4.    S ← [{1,2,...,n}]
 5.    while S ≠ ∅
 6.      # remove arbitrary unprocessed test block, s
 7.      s ← S_1
 8.      S ← S − s
 9.      if s ⊆ U
10.        U ← U − s
11.        # initialize tracking sets for number of neighbors in block
12.        Z = [z_0, z_1, ..., z_{n−1}], z_j = ∅
13.        # iterate over blocks in current partition
14.        foreach block, b_i ∈ B, |b_i| > 1
15.          # iterate over each vertex in current block, b_i ∈ B
16.          foreach vertex, v ∈ b_i, |b_i| > 1
17.            # compute number of neighbors of v in test block, s
18.            d = |getNeighborhood(v) ∩ s|
19.            # assign v to vertex set, z_d, with same number of neighbors
20.            z_d ← z_d ∪ {v}
21.          end foreach
22.          # if block list, Z, contains multiple non-empty blocks, update
23.          if |{z_j ∈ Z : z_j ≠ ∅}| > 1
24.            # replace old block, b_i, with neighbor-degree ordered blocks in Z
25.            B = [b_1, b_2, ..., b_{i−1}, z_0, ..., z_{n−1}, b_{i+1}, b_{i+2}, ..., b_{k=|B|}]
26.            # append blocks in neighbor lists, Z, to potential block list, S
27.            S = S ∪ Z
28.            # append vertices in neighbor lists, Z, to valid vertex list, U
29.            U = U ∪ z_j, j = 0,1,...,n−1, z_j ∈ Z, z_j ≠ ∅
30.          end if
31.        end for
32.      end if
33.    end while
34.    return B
35.  end findCoarsestPartition



Figure  16.  Method  2:  Fast  Algorithm  for  Finding  Equitable  Partitions  [PaT87,  KrS98] 
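
The worklist idea behind Figure 16 can be sketched compactly in Python. This is a simplified illustration rather than a transcription of the figure: it omits the validity set $U$ (a stale test block is simply tested again, which is harmless because every test block remains a union of current blocks), it does not attempt to reach the stated time bound, and the function name and adjacency-dictionary input are assumptions.

```python
from collections import deque

HOUSE = {"a": "be", "b": "ace", "c": "bd", "d": "ce", "e": "abd"}

def refine_worklist(adj):
    """Split every block by its vertices' neighbor counts in each test block s."""
    partition = [frozenset(adj)]             # unit partition B
    pending = deque(partition)               # test blocks S still to apply
    while pending:
        s = pending.popleft()
        next_partition = []
        for block in partition:
            groups = {}                      # neighbor count in s -> vertices (the sets z_d)
            for v in block:
                d = sum(1 for u in adj[v] if u in s)
                groups.setdefault(d, set()).add(v)
            parts = [frozenset(groups[d]) for d in sorted(groups)]
            next_partition.extend(parts)
            if len(parts) > 1:               # block split: queue the new sub-blocks
                pending.extend(parts)
        partition = next_partition
    return partition

house_cep = refine_worklist(HOUSE)
```

On the house graph, the unit test block produces the degree partition, the test block {a,c,d} separates a from c and d, and the remaining test blocks cause no further splits.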
Applying the algorithm to the house graph illustrated in Figure 17 yields the data
structures listed in Table 10, where the vertex set, $U$, is not listed, since each block placed
in $S$ is valid. The final refinement occurs when $b_i$ = {b,e} and $s$ = {a,c,d}, which again
shows the house graph’s coarsest equitable partition is [{a},{c,d},{b,e}].


(a) House Graph [{a,b,c,d,e}]  (b) Iteration 1 [{a,c,d},{b,e}]  (c) Coarsest Equitable Partition [{a},{c,d},{b,e}]

Figure 17. Method 2: Finding the House Graph’s Coarsest Equitable Partition


Table 10. Method 2: Finding the House Graph’s Coarsest Equitable Partition

i  b_i          B                    s            S                    Z
1  {a,b,c,d,e}  [{a,b,c,d,e}]        {a,b,c,d,e}  [{a,b,c,d,e}]        [2 = {a,c,d}, 3 = {b,e}]
2  {a,c,d}      [{a,c,d},{b,e}]      {a,c,d}      [{b,e}]              [0 = {a}, 1 = {c,d}]
3  {b,e}        [{a},{c,d},{b,e}]                 [{b,e},{a},{c,d}]    [2 = {b,e}]
4  {c,d}                             {b,e}        [{a},{c,d}]          [1 = {c,d}]
5  {b,e}                                                               [1 = {b,e}]
6  {c,d}                             {a}          [{c,d}]              [0 = {c,d}]
7  {b,e}                                                               [1 = {b,e}]
8  {c,d}                             {c,d}        []                   [1 = {c,d}]
9  {b,e}                                                               [1 = {b,e}]
2.3.3.3. A Facile Method

The third and final algorithm for computing a graph’s coarsest equitable partition,
1-dimensional (1-D) Weisfeiler-Lehman stabilization, has been repeatedly discovered and
is described most extensively by Weisfeiler and Lehman [Wei76, ReC77, CFI92, Bas02].
This stabilization method can also be used in many contexts, e.g., 1-D Weisfeiler-Lehman
stabilization is used in Chapter 3 to establish that vertices contained in the same block of the
graph’s coarsest equitable partition must have equal PageRank values.

Each vertex is first labeled by its degree, as shown on line 3 in Figure 18. These
labels are augmented with their sorted neighbor labels and replaced by shorter labels to
conserve memory, as shown on lines 9-11. The labeling process iterates until the labels
are unchanged with respect to consecutive iterations, i.e., until the old partition, $S$, equals the
new partition, $B$, on line 7. The stabilized partition, $B$, is the coarsest equitable partition.

 1.  findCoarsestPartition(G, n)
 2.    # initialize labels to vertex degrees and partition
 3.    L_i ← deg(v_i), v_i ∈ V
 4.    B ← [{1,2,...,n}]
 5.    S ← ∅
 6.    # iterate until sorted vertex labels stabilize
 7.    while S ≠ B
 8.      S ← B
 9.      L_i ← [L_i, sort({L_j : (v_i, v_j) ∈ E})]
10.      L ← getUniqueLabels(L)
11.      B ← partition of V induced by the labels, L
12.    end while
13.    return B
14.  end findCoarsestPartition

Figure  18.  Method  3:  1-D  Weisfeiler-Lehman  Stabilization 
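
A runnable sketch of this stabilization follows. It assumes an adjacency-dictionary input, detects stabilization by comparing the number of distinct labels between passes (valid here because each new label contains the old one, so blocks can only split), and assigns short labels in sorted order; these choices and the function name are illustrative rather than taken from Figure 18.

```python
HOUSE = {"a": "be", "b": "ace", "c": "bd", "d": "ce", "e": "abd"}

def weisfeiler_lehman_1d(adj):
    """Return the stabilized partition, with blocks ordered by final label."""
    labels = {v: len(adj[v]) for v in adj}             # start from vertex degrees
    classes = 1                                        # size of the unit partition
    while True:
        # Augment each label with the sorted labels of the vertex's neighbors.
        long_labels = {v: (labels[v], tuple(sorted(labels[u] for u in adj[v])))
                       for v in adj}
        # Replace the long labels by short integer labels 1, 2, ...
        short = {lab: i + 1
                 for i, lab in enumerate(sorted(set(long_labels.values())))}
        labels = {v: short[long_labels[v]] for v in adj}
        if len(short) == classes:                      # no block split: stable
            break
        classes = len(short)
    blocks = {}
    for v in sorted(adj):
        blocks.setdefault(labels[v], []).append(v)
    return [blocks[k] for k in sorted(blocks)]

house_cep = weisfeiler_lehman_1d(HOUSE)    # [['c', 'd'], ['a'], ['b', 'e']]
```

With this label ordering, the house graph's blocks appear as [{c,d},{a},{b,e}].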
For example, the house graph yields the degree labels and neighbor labels shown
in Figures 19(a) and (b), respectively. Shorter unique identifier labels are assigned to each
vertex, as shown in Figure 19(c). In this example, no more partition refinement occurs if
the vertex labels are augmented with the sorted neighbor labels, as shown in Figure 19(d).
Repeating the process only yields replicates of the graphs shown in Figures 19(c) and (d).
Thus, the house graph’s coarsest equitable partition is [{c,d},{a},{b,e}].


Figure  19.  Method  3:  1-D  Weisfeiler-Lehman  Stabilization  Example 


Hence, 1-D Weisfeiler-Lehman stabilization refines the unit partition to the degree
partition, as depicted in Figures 20(a) and (b), respectively. Sorting the adjacent labels of
every vertex and appropriately assigning distinct labels until stabilization occurs produces
the coarsest equitable partition, e.g., [{c,d},{a},{b,e}], as illustrated in Figure 20(c).


(a) House Graph [{a,b,c,d,e}]  (b) Degree Partition [{a,c,d},{b,e}]  (c) Coarsest Equitable Partition [{c,d},{a},{b,e}]

Figure 20. Method 3: Finding the House Graph’s Coarsest Equitable Partition
1-D Weisfeiler-Lehman stabilization can also be readily illustrated using matrices.
For example, the adjacency matrix of the house graph is depicted in Table 11(a). Sorting
the entries by row, from left to right, yields the matrix shown in Table 11(b). Assigning a
shorter, but similarly unique, label to each of the sorted rows yields the labels shown in
Table 12(a), where these labels correspond to the degree partition. Repeating the process
using these shorter labels yields the matrices shown in Tables 12(b) and (c), respectively.
Repeating the process yields the matrices listed in Tables 12(d)-(f) and future iterations
also yield the same partition, i.e., the coarsest equitable partition, [{c,d},{a},{b,e}].


Table 11. The House Graph’s Adjacency and Sorted Degree Matrices

(a) A                      (b) sort(A), sorted left to right

     a  b  c  d  e
  a  0  1  0  0  1           a  0  0  0  1  1
  b  1  0  1  0  1           b  0  0  1  1  1
  c  0  1  0  1  0           c  0  0  0  1  1
  d  0  0  1  0  1           d  0  0  0  1  1
  e  1  1  0  1  0           e  0  0  1  1  1


Table 12. Method 3: 1-D Weisfeiler-Lehman Stabilization of the House Graph

(a) Degree Labels      (b) Sorted Labels I      (c) Shorter Labels I

  a  1                   c  1  1  2               c  1
  b  2                   d  1  1  2               d  1
  c  1                   a  1  2  2               a  2
  d  1                   b  2  1  1  2            b  3
  e  2                   e  2  1  1  2            e  3

(d) Unique Labels      (e) Sorted Labels II     (f) Shorter Labels II

  a  2                   c  1  1  3               c  1
  b  3                   d  1  1  3               d  1
  c  1                   a  2  3  3               a  2
  d  1                   b  3  1  2  3            b  3
  e  3                   e  3  1  2  3            e  3
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The  coarsest  equitable  partition  is  unique  up  to  a  block  permutation.  For  example, 
applying  the  methods  described  in  Sections  2.3.3. 1  or  2. 3. 3. 2  to  the  house  graph  yields 

[{a},{c,  Blocks  [a]  and  {c,d]  exchange  positions  if  1-D  Weisfeiler-Lehman 

stabilization  is  applied,  yielding  [{c,  To  avoid  this  issue,  the  same  method 

should  be  used  to  compute  the  coarsest  equitable  partition  throughout  an  application. 

Each iteration of 1-D Weisfeiler-Lehman stabilization requires sorting n vectors of
length n and finding shorter labels for each vector, where each step is
$O(n^2 \cdot \log n)$. Since up to $\log_2 n^2$ iterations are required to achieve
stabilization, e.g., for path graphs [Bas02], the upper bound is
$O(n^2 \cdot \log n \cdot \log n^2)$. This bound reduces to
$O(d \cdot n \cdot \log n \cdot \log n^2)$ if a graph is stored using adjacency lists,
where $d = \max(\deg(v_i))$ [Bas02].

The method described in Section 2.3.3.2 for finding a coarsest equitable partition
is more efficient, since its upper bound is merely $O(n^2 \cdot \log n)$. Thus,
Weisfeiler-Lehman stabilization is slower by a factor of
$(2 \cdot n^2 \cdot \log n \cdot \log n^2) / (n^2 \cdot \log n) = 2 \cdot \log n^2 = 4 \cdot \log n$
with respect to the more efficient method. However, 1-D Weisfeiler-Lehman stabilization
is simpler to describe and easier to implement correctly [PaT87].

Moreover, the similarity between performing 1-D Weisfeiler-Lehman stabilization
and performing matrix multiplication yields the proof contained in Chapter 3 establishing
a relationship between the coarsest equitable partition and dot product. Finally, the proof
yields three progressively sophisticated methods of improving the PageRank algorithm’s
performance, as described in Chapters 4 and 5. These methods improve certain numerical
properties of the PageRank vector and the latter two methods reduce execution time.

2.3.4. The Orbit Partition

The orbit partition is obtained by separating the graph’s set of permutations into
two disjoint classes: isomorphisms and automorphisms. An isomorphism is a permutation
yielding a distinct isomorph, where one or more edges receive new labels. Conversely, an
automorphism is a permutation yielding the original graph, i.e., edges retain their original
labels. If an isomorphism is applied, the associated adjacency matrices are different, but if
an automorphism is applied, the adjacency matrices are identical.

For example, applying the permutation, $\phi_1 = [2,1,3,4]$, to the square illustrated in
Figure  21(a)  yields  the  isomorph  shown  in  Figure  21(b),  which  is  also  unique,  since  some 
edges  receive  different  labels,  e.g.,  edges  {a,  and  respectively.  Conversely,  the 


automorphism, $\phi_2 = [3,4,1,2]$, illustrated in Figure 21(c) yields an isomorph with the same edge labels.
The  same  effect  is  observed  in  Table  13,  where  the  square’s  and  the  isomorph’s  adjacency 
matrices  differ,  but  the  square’s  and  the  automorph’s  adjacency  matrices  are  identical. 
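The distinction between the two permutation classes can be checked mechanically by permuting the rows and columns of the square's adjacency matrix and comparing the result with the original. The following is a minimal sketch; the helper names `permute` and `is_automorphism` are hypothetical, and permutations are assumed to list, for each position, the 1-based index of the original vertex placed there, which matches the row order of Table 13.

```python
from itertools import permutations


def permute(matrix, perm):
    """Reorder rows and columns so position k holds original vertex perm[k] (1-based)."""
    idx = [p - 1 for p in perm]
    return [[matrix[i][j] for j in idx] for i in idx]


def is_automorphism(matrix, perm):
    """A permutation is an automorphism when it leaves the adjacency matrix unchanged."""
    return permute(matrix, perm) == matrix


# Adjacency matrix of the square, Table 13(a), vertex order a, b, c, d.
SQUARE = [[0, 1, 0, 1],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [1, 0, 1, 0]]

print(permute(SQUARE, [2, 1, 3, 4]))          # matches Table 13(b), differs from SQUARE
print(is_automorphism(SQUARE, [3, 4, 1, 2]))  # True
n_auto = sum(is_automorphism(SQUARE, p) for p in permutations([1, 2, 3, 4]))
print(n_auto)                                  # 8
```

Of the 4! = 24 permutations of the square, eight leave the matrix unchanged, so the remaining sixteen are isomorphisms in the sense used here.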


[Figure 21 panels: (a) The Square, G; (b) An Isomorph of G, $\phi_1 = [2,1,3,4]$;
(c) An Automorph of G, $\phi_2 = [3,4,1,2]$]

Figure 21. Isomorph and Automorph of the Square


Table 13. Isomorph and Automorph of the Square

(a) A

     a  b  c  d
a    0  1  0  1
b    1  0  1  0
c    0  1  0  1
d    1  0  1  0

(b) $A_1 = \phi_{[2,1,3,4]}(A)$

     b  a  c  d
b    0  1  1  0
a    1  0  0  1
c    1  0  0  1
d    0  1  1  0

(c) $A_2 = \phi_{[3,4,1,2]}(A)$

     c  d  a  b
c    0  1  0  1
d    1  0  1  0
a    0  1  0  1
b    1  0  1  0

All graphs yield the trivial automorphism, or identity permutation,
$\phi = [1, 2, \ldots, n]$. Most automorphisms send some vertices to another, albeit
logically equivalent, position. If an arbitrary vertex, u, is sent to another vertex, v,
by some number of automorphisms, an equal number of automorphisms send v to u, and such
vertices are in the same orbit. The orbit partition is the disjoint equitable vertex
partition listing the graph’s orbits.
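For very small graphs, the definition of an orbit can be applied directly by enumerating every permutation and keeping those that preserve the edge set. The sketch below does exactly that for the house graph; it is an exhaustive n! search intended only to illustrate the definition, and the names `automorphisms`, `orbit_partition`, and `HOUSE` are hypothetical.

```python
from itertools import permutations


def automorphisms(adj):
    """Yield every vertex relabeling that maps the edge set onto itself."""
    vertices = sorted(adj)
    edges = {(u, v) for u in adj for v in adj[u]}   # both directions of each edge
    for image in permutations(vertices):
        phi = dict(zip(vertices, image))
        if {(phi[u], phi[v]) for u, v in edges} == edges:
            yield phi


def orbit_partition(adj):
    """Collect, for each vertex, the set of vertices some automorphism sends it to."""
    autos = list(automorphisms(adj))
    orbits = {frozenset(phi[v] for phi in autos) for v in adj}
    return sorted(sorted(o) for o in orbits)


# House graph: each vertex maps to a string of its neighbors.
HOUSE = {"a": "be", "b": "ace", "c": "bd", "d": "ce", "e": "abd"}

print(len(list(automorphisms(HOUSE))))  # 2
print(orbit_partition(HOUSE))           # [['a'], ['b', 'e'], ['c', 'd']]
```

The house graph has only the identity and one reflection, which exchanges b with e and c with d, so vertex a forms an orbit by itself.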

The orbit partition equals or is more refined than the coarsest equitable partition.
For example, the house graph’s coarsest equitable and orbit partitions are equal, as shown
in Figure 22. Conversely, the coarsest equitable partition of the cuneane graph shown in
Figure 23(a) [TiK99, StT99] contains a single block, [{a,b,...,h}], but its orbit partition
contains three blocks, [{a,h},{d,e},{b,c,f,g}], as shown in Figure 23(b).


Figure 22. House Graph’s Coarsest Equitable and Orbit Partition, [{c,d},{a},{b,e}]


(a)  Coarsest  Equitable  Partition 


(b)  Orbit  Partition 


Figure 23. Cuneane Graph’s Distinct Coarsest Equitable and Orbit Partitions

2.3.4.1. Vertex Individualization and Arbitrary Partition Stabilization

Finding  the  orbit  partition  involves  applying  many  techniques,  especially  if  the 

coarsest  equitable  partition  is  not  discrete,  i.e.,  contains  one  or  more  blocks  composed  of 
multiple  vertices.  The  individualization  process  involves  splitting  a  non-discrete  block 
and  assigning  a  single  vertex  to  its  own  block,  where  one  or  more  blocks  are  successively 
split  each  possible  way.  Since  the  partition  induced  by  applying  individualization  may  not 
be  equitable,  the  method  used  to  compute  the  initial  coarsest  equitable  partition  is  applied 
to  the  individualized  partition,  thus  yielding  a  more  refined  and  equitable  partition. 

For example, applying Weisfeiler-Lehman stabilization to the house graph yields
the coarsest equitable partition, [{c,d},{a},{b,e}], shown in Figure 24. Two blocks are
not discrete, thus, the vertices of one or more blocks are individualized. For example,
splitting block {b,e} by individualizing vertex b induces the partition shown in
Figure 25(a). This partition is not equitable, since block {c,d} contains a vertex, c,
adjacent to one vertex in block {b}, namely, vertex b, whereas vertex d is not adjacent
to vertex b. Applying 1-D Weisfeiler-Lehman stabilization to the partition obtained by
individualizing vertex b yields the discrete and equitable partition shown in
Figure 25(b). The other three potential vertex individualizations are shown in
Figures 25(c)-(h).
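Individualization followed by re-stabilization can be sketched as below. This is an illustrative sketch under stated assumptions rather than the procedure used by nauty: `refine` and `individualize` are hypothetical names, the refinement is the label-sorting form of 1-D Weisfeiler-Lehman stabilization, and the individualized vertex is placed ahead of the remainder of its block.

```python
def refine(adj, partition):
    """Stabilize an ordered partition with 1-D Weisfeiler-Lehman label refinement."""
    labels = {v: i for i, block in enumerate(partition) for v in block}
    while True:
        # Own label first, so refinement can only split blocks, never merge them.
        sig = {v: (labels[v], tuple(sorted(labels[u] for u in adj[v]))) for v in adj}
        short = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: short[sig[v]] for v in adj}
        if len(short) == len(set(labels.values())):
            break                                   # no block was split
        labels = new
    grouped = {}
    for v in sorted(adj):
        grouped.setdefault(new[v], []).append(v)
    return [grouped[k] for k in sorted(grouped)]


def individualize(partition, vertex):
    """Split vertex out of its block, placing the new singleton block first."""
    result = []
    for block in partition:
        if vertex in block and len(block) > 1:
            result.append([vertex])
            result.append([u for u in block if u != vertex])
        else:
            result.append(list(block))
    return result


HOUSE = {"a": "be", "b": "ace", "c": "bd", "d": "ce", "e": "abd"}

coarsest = refine(HOUSE, [list(HOUSE)])             # [['c', 'd'], ['a'], ['b', 'e']]
for v in "bcde":
    print(v, individualize(coarsest, v), refine(HOUSE, individualize(coarsest, v)))
```

For the house graph, each of the four individualizations refines to a discrete partition, which is consistent with the equitable partitions depicted in Figures 25(b), (d), (f), and (h).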


Figure 24. House Graph’s Coarsest Equitable Partition, [{c,d},{a},{b,e}]

[Figure 25 panels: (a) Individualizing b, [{c,d},{a},{b},{e}]; (b) Equitable Partition;
(c) Individualizing c, [{c},{d},{a},{b,e}]; (d) Equitable Partition;
(e) Individualizing d, [{d},{c},{a},{b,e}]; (f) Equitable Partition;
(g) Individualizing e, [{c,d},{a},{e},{b}]; (h) Equitable Partition]

Figure 25. Refining to Equitable Partitions after Vertex Individualization

2.3.4.2. Computing the Orbit Partition

Many methods have been described for determining graph isomorphism on some
graphs [CoG70, Rea72, BES80, BaL83]. Many algorithms have also been developed that
decide isomorphism for arbitrary graphs [CFS+04, BeE96] and many more are frequently
proposed [KuS07, JuK07]. The traditional application used to decide graph isomorphism,
nauty [McK04], is often used to provide a performance baseline to assess new algorithms
that decide graph isomorphism. The key tools applied in nauty [McK81] are extensively
described by Babai [Bab95], Kocay [Koc96], and Kreher and Stinson [KrS98]. In nauty,
the key goal is to find a canonical isomorph, and in so doing, find the orbit partition.

Applications that are similar to nauty, such as nice [Mil07], perform a depth-first
search of a graph’s vertex partition tree to identify a canonical isomorph. However, many
methods are used to dramatically prune this partition tree. For example, the house graph
contains five vertices, thus, its original partition tree contains 5! = 120 leaves. Applying
the coarsest equitable partition and vertex individualization yields the pruned 4-leaf tree
shown in Figure 26, where the equitable partitions are boxed. Since individualizing block
{c,d} yields discrete partitions, it is not necessary to individualize block {b,e}.


[Figure 26: vertex partition tree rooted at the unit partition]

Figure 26. House Graph’s Vertex Partition Tree (Equitable Partitions Boxed)

The next technique prunes the partition tree by comparing partial permutations,
i.e., by comparing the binary values induced by each partition’s leading discrete blocks.
That method extends the concept of retaining the minimum discovered isomorph described in
Section 2.2.4. For example, the house graph and its corresponding adjacency matrix are
shown in Figure 27 and Table 14(a), respectively.

Individualizing the two vertices, c and d, as shown in Figure 26 yields the two
non-equitable partitions, [{c},{d},{a},{b,e}] and [{d},{c},{a},{b,e}], respectively.
The leading discrete blocks in each partition are [{c},{d},{a}] and [{d},{c},{a}], thus
inducing the partial permutations, [3,4,1] and [4,3,1], respectively. The resulting partial
isomorphs are shown in Tables 14(b) and (c), respectively. Since the upper-right triangles
in both matrices are identical, neither branch can be eliminated in this example, thus, both
branches of the pruned vertex partition tree must be traversed.


Figure 27. House Graph


Table 14. Partial Permutations of the House Graph’s Adjacency Matrix

(a) A

     a  b  c  d  e
a    0  1  0  0  1
b    1  0  1  0  1
c    0  1  0  1  0
d    0  0  1  0  1
e    1  1  0  1  0

(b) $A_1 = \phi_{[3,4,1]}(A)$

     c  d  a
c    0  1  0
d    1  0  0
a    0  0  0

(c) $A_2 = \phi_{[4,3,1]}(A)$

     d  c  a
d    0  1  0
c    1  0  0
a    0  0  0

Since individualizing vertex c or d does not yield a discrete partition in the vertex
partition tree shown in Figure 26, both leaves are further refined. The first leaf is obtained
by individualizing vertex c and refining to the coarsest equitable partition, which yields
[{c},{d},{a},{b},{e}] and induces the permutation, $\phi_1 = [3,4,1,2,5]$. Similarly, the
leaf obtained by individualizing vertex d and refining to a coarsest equitable partition
yields [{d},{c},{a},{e},{b}] and induces the permutation, $\phi_2 = [4,3,1,5,2]$.
Concatenating the entries contained in the columns of the upper-right triangle of the first
isomorph yields the binary value, $1001010111_2 = 599_{10}$, and the other isomorph also
yields $1001010111_2 = 599_{10}$. Since both isomorphs yield the same binary value, these
isomorphs are automorphs.
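The binary value of an isomorph can be computed by permuting the adjacency matrix and reading the upper-right triangle one column at a time. The sketch below is illustrative only; `permute` and `upper_triangle_value` are hypothetical helper names, and each permutation is assumed to list the 1-based index of the original vertex placed at each position.

```python
def permute(matrix, perm):
    """Reorder rows and columns so position k holds original vertex perm[k] (1-based)."""
    idx = [p - 1 for p in perm]
    return [[matrix[i][j] for j in idx] for i in idx]


def upper_triangle_value(matrix):
    """Read the upper-right triangle column by column as a single binary number."""
    n = len(matrix)
    bits = "".join(str(matrix[i][j]) for j in range(1, n) for i in range(j))
    return int(bits, 2)


# House graph adjacency matrix, vertex order a, b, c, d, e (Table 14(a)).
HOUSE_A = [[0, 1, 0, 0, 1],
           [1, 0, 1, 0, 1],
           [0, 1, 0, 1, 0],
           [0, 0, 1, 0, 1],
           [1, 1, 0, 1, 0]]

for phi in ([3, 4, 1, 2, 5], [4, 3, 1, 5, 2]):
    value = upper_triangle_value(permute(HOUSE_A, phi))
    print(phi, format(value, "010b"), value)   # both lines end with 1001010111 599
```

Because the two leaf permutations produce identical matrices, their binary values agree, which is the comparison used to recognize that the leaves are automorphs.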

In this example, the leaves that remain in the pruned vertex partition tree induce
permutations that are also automorphisms. Alternatively stated, each leaf, or permutation,
yields the same isomorph, which is designated as the house graph’s canonical isomorph.
In sum, applying the coarsest equitable partition, individualization, and selection of the
minimum remaining isomorph reduced the search space for a canonical isomorph of the
house graph from 5! = 120 potential permutations (leaves) to two permutations (leaves).
However, the techniques applied thus far have not addressed computing the orbit partition
while conducting this depth-first search.

Table 15. House Graph Isomorphs

(a) $A_1 = \phi_{[3,4,1,2,5]}(A)$

     c  d  a  b  e
c    0  1  0  1  0
d    1  0  0  0  1
a    0  0  0  1  1
b    1  0  1  0  1
e    0  1  1  1  0

(b) $A_2 = \phi_{[4,3,1,5,2]}(A)$

     d  c  a  e  b
d    0  1  0  1  0
c    1  0  0  0  1
a    0  0  0  1  1
e    1  0  1  0  1
b    0  1  1  1  0

The last technique that is used to prune the partition tree yields the orbit partition.
However, this technique is the most complex method applied, and similarly, it is the most
difficult to implement correctly. Recalling the previous example, it was observed that the
permutations induced by the leaves remaining in the partition tree illustrated in Figure 26
yield the same isomorph of the house graph, i.e., they are relative automorphisms.

The first leaf contains the discrete vertex partition, [{c},{d},{a},{b},{e}], thus
inducing the permutation, $\phi_1 = [3,4,1,2,5]$. Similarly, the second leaf contains the
discrete partition, [{d},{c},{a},{e},{b}], inducing the permutation,
$\phi_2 = [4,3,1,5,2]$. Inspection reveals the fundamental difference between the two
permutations is the position exchange with respect to vertices c and d, or similarly,
vertices b and e. Thus, vertices c and d are located in the same orbit and vertices b and e
are located in the same orbit. Conversely, vertex a is not located in the orbit of any other
vertex, thus, it is the only vertex contained in its block of the house graph’s coarsest
equitable partition. Therefore, the house graph’s orbit partition is [{c,d},{a},{b,e}].
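The inspection described above amounts to pairing the vertices that occupy the same position in the two leaves and merging each pair into a common orbit. A minimal sketch using a union-find structure follows; the name `orbits_from_leaves` is hypothetical, and the sketch assumes the two leaves are already known to yield the same isomorph, since otherwise the position-wise mapping is not an automorphism.

```python
def orbits_from_leaves(leaf1, leaf2):
    """Merge vertices that occupy the same position in two automorphic leaves."""
    parent = {v: v for v in leaf1}

    def find(v):
        # Follow parent links to the representative, compressing the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in zip(leaf1, leaf2):
        parent[find(u)] = find(v)      # u and v are exchanged by the automorphism

    groups = {}
    for v in leaf1:
        groups.setdefault(find(v), []).append(v)
    return sorted(sorted(g) for g in groups.values())


# Vertex orders of the two discrete leaves, [3,4,1,2,5] and [4,3,1,5,2].
print(orbits_from_leaves("cdabe", "dcaeb"))  # [['a'], ['b', 'e'], ['c', 'd']]
```

With more than two automorphic leaves, the same merging step could be repeated for each discovered automorphism, which is the role that orbit bookkeeping plays when pruning the partition tree.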

The management of such automorphisms may enable pruning several branches in
the partition tree, although that is not true in this example. Group theory machinery,
e.g., orbits, stabilizers, the automorphism group, the Orbit-Stabilizer theorem, and the
Schreier-Sims algorithm, is applied in methods that use automorphisms to prune the vertex
partition tree [KrS98, Hof82, Mar02, Ser02]. However, even after applying all three tools
for pruning the partition tree, i.e., coarsest equitable partitions, partial permutations,
and automorphisms, applications similar in design to nauty still require exponential time
to find the orbit partition or a canonical isomorph of certain graphs [Miy96].

2.3.4.3. Easy, Hard, and In-Between Graphs

A rigorous method of classifying an arbitrary graph with respect to the difficulty it
causes algorithms that decide graph isomorphism, such as nauty, is not known to exist.
However, various subjective notions of measuring relative graph difficulty are often used.
For instance, almost every random graph is considered an easy graph, since the coarsest
equitable partition of almost all random graphs is discrete. Alternatively stated, applying
the coarsest equitable partition to an arbitrary random graph prunes its partition tree from
having n! leaves to a single leaf, yielding a canonical isomorph in $O(n^2 \cdot \log n)$ time.


In  contrast,  the  coarsest  equitable  partition  of  a  non-trivial  graph  contains  at  least 
one  block  composed  of  multiple  vertices,  i.e.,  a  non-discrete  partition.  In  the  extreme,  the 
coarsest  equitable  partition  may  only  contain  one  block,  i.e.,  it  is  the  unit  partition.  Thus, 
an  arbitrary  graph  yields  a  coarsest  equitable  partition  that  is  a  discrete,  non-discrete,  or 
unit  partition,  e.g.,  the  graphs  illustrated  in  Figures  28(a)-(c),  respectively.  The  methods 
described  herein  for  improving  the  PageRank  algorithm’s  performance  can  be  applied  to 
graphs  whose  coarsest  equitable  partition  is  non-discrete,  e.g.,  the  house  or  cuneane  graph 
shown  in  Figures  28(b)  and  (c),  respectively. 


[Figure 28 panels: (a) Mansion Graph (discrete partition); (b) House Graph (non-discrete
partition), [{c,d},{a},{b,e}]; (c) Cuneane Graph (unit partition), [{a,b,...,h}]]

Figure 28. Easy, Medium, and Hard: Discrete, Non-Discrete and Unit Partitions

2.3.5. Induced Quotient Graphs and Matrices

Equitable partitions, such as the coarsest equitable partition and orbit partition,
are closely related to the eigenvalues and eigenvectors of the graph’s adjacency matrix.
The concept of an equitable partition was first defined by Schwenk in the context of this
relationship [Sch74], and not as a tool for pruning search trees in algorithms that
decide graph isomorphism. Sachs defined the equivalent graph divisors concept, as noted
by Cvetkovic, Rowlinson and Simic [CRS97]. As its origin suggests, equitable partitions
can be used to explore a graph’s eigensystem. In particular, if some equitable partition is
non-discrete, portions of a graph’s eigensystem can be explored more efficiently using the
quotient graph induced by the equitable partition [Hae95, God93, GoR01].

In  simple  terms,  an  equitable  partition  divides  the  eigenvalues,  or  equivalently,  the 
characteristic  polynomial,  of  the  graph’s  adjacency  matrix.  The  quotient  graph  induced  by 
the  equitable  partition  also  defines  an  adjacency  matrix  whose  eigenvalues  are  a  subset  of 
the eigenvalues yielded by the graph’s adjacency matrix. This relationship also illustrates
that a graph can possess regular structure and that vertices can be mapped to some eigenvalue
or  eigenvector  entry.  This  idea  has  also  been  used  as  the  basis  of  algorithms  that  decide 
graph  isomorphism  [CRS97,  HZL05]. 

The quotient graph induced by an arbitrary equitable partition, e.g., the coarsest
equitable partition or orbit partition of an arbitrary graph, replaces the vertices contained
in a given block of the partition with one vertex in the quotient graph. The edges between
the vertices in the quotient graph correspond to the number of neighbors contained in the
destination block with respect to each source block vertex. Quotient graphs often contain
weighted edges, directed edges, and loops, i.e., they typically are not simple graphs.

For example, the house graph shown in Figure 29(a) yields the coarsest equitable
partition, [{a},{c,d},{b,e}], which is depicted using unique shapes in Figure 29(b). The
partition contains three blocks, thus, the quotient graph contains three vertices, where one
vertex corresponds to each block contained in the partition, as shown in Figure 29(c) and
reflected in the corresponding adjacency and quotient matrices listed in Table 16.

Vertices c and d are each connected to one vertex in block {b,e}, thus, an edge of
weight ‘1’, or a 1-edge, links block {c,d} to block {b,e} and vice versa. Vertices c and d
are connected to each other, as are vertices b and e, thus, a 1-loop is attached to blocks
{c,d} and {b,e}. Since vertex a is connected to both vertices contained in block {b,e},
a 2-edge links block {a} to block {b,e}. Finally, a 1-edge connects block {b,e} to block
{a}, since vertices b and e each have one neighbor contained in block {a}, namely, a.
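The edge weights described above can be obtained by counting, for each vertex, its neighbors in every block and confirming that all vertices of a block agree. The following is a minimal sketch; the function name `quotient` is hypothetical, and the result is indexed so that entry (i, j) is the number of neighbors in block j of any vertex in block i, i.e., rows correspond to source blocks.

```python
def quotient(adj, partition):
    """Return the block-to-block neighbor counts, or raise if the partition is not equitable."""
    q = []
    for src in partition:
        # Every vertex in a source block must see the same count in each destination block.
        rows = {tuple(sum(1 for u in adj[v] if u in dst) for dst in partition) for v in src}
        if len(rows) != 1:
            raise ValueError("partition is not equitable")
        q.append(list(rows.pop()))
    return q


HOUSE = {"a": "be", "b": "ace", "c": "bd", "d": "ce", "e": "abd"}

print(quotient(HOUSE, [["a"], ["b", "e"], ["c", "d"]]))
# [[0, 2, 0], [1, 1, 1], [0, 1, 1]]
```

Applying the same function to the degree partition, [{a,c,d},{b,e}], raises the error, since vertex a has two neighbors in block {b,e} whereas vertices c and d each have one.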


(a)  House  Graph  (b)  Coarsest  Partition  (c)  Induced  Quotient  Graph 

Figure  29.  House  Graph’s  Induced  Quotient  Graph 

Table 16. House Graph’s Induced Quotient Matrix

(a) A

     a  b  c  d  e
a    0  1  0  0  1
b    1  0  1  0  1
c    0  1  0  1  0
d    0  0  1  0  1
e    1  1  0  1  0

(b) Q (rows: destination block; columns: source block)

         {a}   {b,e}   {c,d}
{a}       0      1       0
{b,e}     2      1       1
{c,d}     0      1       1

2.3.5.1. Interlacing Eigenvalues

Given an arbitrary matrix, M, and similarly arbitrary partition, $\mathcal{B}$, the
eigenvalues of the induced quotient matrix, Q, interlace M’s eigenvalues. That is, given an
arbitrary matrix, $\mathbf{M}^{n \times n}$, an arbitrary partition, $\mathcal{B}$,
containing m blocks, and the induced quotient matrix, $\mathbf{Q}^{m \times m}$, Q’s
$i^{th}$ eigenvalue is less than or equal to M’s $i^{th}$ eigenvalue and greater than or
equal to M’s $(n-m+i)^{th}$ eigenvalue, i.e.,

$\lambda_{n-m+i}(\mathbf{M}) \le \lambda_i(\mathbf{Q}) \le \lambda_i(\mathbf{M})$. (6)

However, if the partition, $\mathcal{B}$, used to construct Q is an equitable partition,
e.g., the coarsest equitable partition, Q’s eigenvalues are a subset of M’s eigenvalues.
Thus, given an arbitrary eigenvalue of Q, denoted $\lambda_i(\mathbf{Q})$, there exists
some corresponding value, j, such that $\lambda_i(\mathbf{Q}) = \lambda_j(\mathbf{M})$.
In certain sources, this restricted version is called interlacing, whereas generalized
interlacing permits $\mathcal{B}$ to be non-equitable.

For example, the eigenvalues of the house graph and the quotient graph induced
by its coarsest equitable partition, [{a},{c,d},{b,e}], are listed in Tables 17(a) and (b),
respectively. Since this partition is equitable, the quotient graph’s eigenvalues are a subset
of the house graph’s eigenvalues, as shown in Table 17.
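The subset relationship for the house graph can also be verified by hand from the characteristic polynomials. The derivation below is a worked check rather than part of the original development; it assumes the quotient matrix is indexed with blocks ordered {a}, {b,e}, {c,d} and with entry (i, j) counting the neighbors in block j of a vertex in block i (transposing the matrix does not change the polynomial).

```latex
% Quotient matrix of the house graph and its characteristic polynomial.
\[
\mathbf{Q} = \begin{pmatrix} 0 & 2 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix},
\qquad
\det(\lambda \mathbf{I} - \mathbf{Q}) = \lambda^{3} - 2\lambda^{2} - 2\lambda + 2 .
\]
% Characteristic polynomial of the 5x5 adjacency matrix A (6 edges, 1 triangle).
\[
\det(\lambda \mathbf{I} - \mathbf{A})
  = \lambda^{5} - 6\lambda^{3} - 2\lambda^{2} + 4\lambda
  = \lambda \, (\lambda + 2) \, (\lambda^{3} - 2\lambda^{2} - 2\lambda + 2) .
\]
```

The cubic factor is exactly the characteristic polynomial of Q, so its three roots, approximately 2.4812, 0.6889, and -1.1701, are shared by both matrices, while the remaining factors contribute the house graph's eigenvalues 0 and -2.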


Table 17. House Graph’s and Its Induced Quotient Graph’s Eigenvalues

(a) $\lambda_i$, House Graph

i    $\lambda_i$
1     2.4812
2     0.6889
3     0.0000
4    -1.1701
5    -2.0000

(b) $\lambda_i$, Induced Quotient Graph

i    $\lambda_i$
1     2.4812
2     0.6889
3    -1.1701

2.3.5.2. Lifting Eigenvectors

Some eigenvectors of M can be lifted from eigenvectors of the quotient matrix, Q.
This relationship is based on the characteristic matrix, B, the incidence matrix derived
from the vertex partition, $\mathcal{B}$, used to induce the quotient matrix, Q. Rows in B
correspond to rows and columns in a source matrix, M, and columns in B correspond to blocks
in $\mathcal{B}$. For example, the house graph’s coarsest equitable partition yields the
5×3 characteristic matrix, or block matrix, listed in Table 18.

The characteristic matrix, B, is the basis of several identities with respect to the
original and quotient graphs. The first elementary result is

$\mathbf{N} = \mathbf{B}^{T} \cdot \mathbf{B}$, (7)

where N is the diagonal matrix reflecting the number of vertices contained in the blocks
of the associated coarsest equitable partition, as shown in Table 19(a). Since N is simply a
diagonal matrix containing non-zero diagonal entries, its matrix inverse,
$\mathbf{N}^{-1}$, is obtained by simply reciprocating N’s diagonal entries, as shown in
Table 19(b).
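The characteristic matrix and Equation (7) can be reproduced with a few lines of exact arithmetic. This is a minimal sketch in which `characteristic_matrix`, `transpose`, and `matmul` are hypothetical helper names, and the blocks are assumed to be ordered {a}, {b,e}, {c,d} as in Tables 18 and 19.

```python
from fractions import Fraction


def characteristic_matrix(vertices, partition):
    """Row per vertex, column per block; entry is 1 when the vertex lies in the block."""
    return [[1 if v in block else 0 for block in partition] for v in vertices]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


B = characteristic_matrix("abcde", [["a"], ["b", "e"], ["c", "d"]])
N = matmul(transpose(B), B)                      # diagonal matrix of block sizes
# Invert N by reciprocating its diagonal entries.
N_inv = [[Fraction(1, N[i][i]) if i == j else Fraction(0) for j in range(len(N))]
         for i in range(len(N))]

print(B)      # [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1], [0, 1, 0]]
print(N)      # [[1, 0, 0], [0, 2, 0], [0, 0, 2]]
print(matmul(N_inv, N) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])  # True
```

Exact fractions are used so that the reciprocals 1/2 are represented without rounding.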


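Table 18. Block Matrix, B, of the House Graph’s Coarsest Equitable Partition

     {a}   {b,e}   {c,d}
a     1      0       0
b     0      1       0
c     0      0       1
d     0      0       1
e     0      1       0

Table 19. Block Matrix, N, of the House Graph’s Coarsest Equitable Partition

(a) $\mathbf{N} = \mathbf{B}^{T} \cdot \mathbf{B}$, $N_{jj} = \sum_i B_{ij}$

         {a}   {b,e}   {c,d}
{a}       1      0       0
{b,e}     0      2       0
{c,d}     0      0       2

(b) $\mathbf{N}^{-1}$, $N^{-1}_{jj} = 1 / N_{jj}$

         {a}   {b,e}   {c,d}
{a}       1      0       0
{b,e}     0     1/2      0
{c,d}     0      0      1/2

More notably, it further can be shown that [God93, GoR01]

$\mathbf{A} \cdot \mathbf{B} = \mathbf{B} \cdot \mathbf{Q}$. (8)

By left-multiplying each side with $\mathbf{B}^{T}$, it can also be shown that
$\mathbf{Q} = \mathbf{B}^{+} \cdot \mathbf{A} \cdot \mathbf{B}$, since

$$
\begin{aligned}
\mathbf{B}^{T} \mathbf{A} \mathbf{B} &= \mathbf{B}^{T} \mathbf{B} \, \mathbf{Q} \\
(\mathbf{B}^{T} \mathbf{B})^{-1} \mathbf{B}^{T} \mathbf{A} \mathbf{B} &= (\mathbf{B}^{T} \mathbf{B})^{-1} (\mathbf{B}^{T} \mathbf{B}) \, \mathbf{Q} \\
(\mathbf{B}^{T} \mathbf{B})^{-1} \mathbf{B}^{T} \mathbf{A} \mathbf{B} &= \mathbf{Q} \\
\mathbf{B}^{+} \mathbf{A} \mathbf{B} &= \mathbf{Q},
\end{aligned}
\qquad (9)
$$

where $\mathbf{B}^{+}$ denotes the pseudo-inverse of B, i.e.,
$\mathbf{B}^{+} = (\mathbf{B}^{T} \cdot \mathbf{B})^{-1} \cdot \mathbf{B}^{T}$. Given an
eigenvalue, $\lambda_i$, and an eigenvector, $\mathbf{r}_i \neq 0$, of the quotient graph,
Q, yielded by an equitable partition of A, it can be shown that [Hae95]

$\mathbf{A} \mathbf{B} \mathbf{r}_i = \mathbf{B} \mathbf{Q} \mathbf{r}_i = \lambda_i \mathbf{B} \mathbf{r}_i$, (10)

where substituting $\mathbf{x}_i = \mathbf{B} \cdot \mathbf{r}_i$ in
$\mathbf{A} \cdot \mathbf{B} \cdot \mathbf{r}_i = \lambda_i \cdot \mathbf{B} \cdot \mathbf{r}_i$
yields $\mathbf{A} \cdot \mathbf{x}_i = \lambda_i \cdot \mathbf{x}_i$.

Thus, this identity lifts the quotient matrix eigenvector, $\mathbf{r}_i$, to some
eigenvector of the adjacency matrix, $\mathbf{x}_i = \mathbf{B} \mathbf{r}_i$. Quotient
matrices may be smaller than the input matrix, thus, lifting can accelerate finding
eigenvectors. For example, the quotient matrix induced by the house graph’s coarsest
equitable partition yields the dominant eigenvector listed in Table 20(a). Lifting yields
the house graph’s dominant eigenvector listed in Table 20(b).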
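Equations (8) through (10) can be exercised numerically on the house graph. The sketch below forms Q from the pseudo-inverse, checks Equation (8) exactly, finds the dominant eigenpair of Q by power iteration, and lifts it. It is illustrative only: the helper names and the choice of plain power iteration with 200 steps are assumptions, and the eigenvector of Q is scaled to unit Euclidean length, which appears to be the scaling used in Table 20(a).

```python
from fractions import Fraction
from math import sqrt


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


# House graph adjacency matrix (order a..e) and characteristic matrix (Table 18).
A = [[0, 1, 0, 0, 1], [1, 0, 1, 0, 1], [0, 1, 0, 1, 0], [0, 0, 1, 0, 1], [1, 1, 0, 1, 0]]
B = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1], [0, 1, 0]]

Bt = transpose(B)
N = matmul(Bt, B)
# Pseudo-inverse (B^T B)^-1 B^T: divide row i of B^T by the size of block i.
B_pinv = [[Fraction(x, N[i][i]) for x in row] for i, row in enumerate(Bt)]
Q = matmul(matmul(B_pinv, A), B)
assert matmul(A, B) == matmul(B, Q)                 # Equation (8)


def dominant_eigenpair(m, steps=200):
    """Plain power iteration, normalizing to unit Euclidean length at each step."""
    r = [1.0] * len(m)
    for _ in range(steps):
        s = [sum(float(a) * x for a, x in zip(row, r)) for row in m]
        norm = sqrt(sum(x * x for x in s))
        r = [x / norm for x in s]
    lam = sum(float(a) * x for a, x in zip(m[0], r)) / r[0]
    return lam, r


lam, r = dominant_eigenpair(Q)
x = [sum(b * ri for b, ri in zip(row, r)) for row in B]   # lifted vector x = B r
residual = max(abs(sum(a * xi for a, xi in zip(row, x)) - lam * xv)
               for row, xv in zip(A, x))
print(round(lam, 4), [round(v, 4) for v in r], residual < 1e-9)
```

The lifted vector repeats the block value for every vertex in the block, so only three dot products per iteration are needed instead of five for this graph.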

Table 20. Lifting a Dominant Eigenvector of the House Graph

(a) $\mathbf{r}_i$

{a}      0.5555
{b,e}    0.6892
{c,d}    0.4653

(b) $\mathbf{x}_i = \mathbf{B} \cdot \mathbf{r}_i$

a    0.5555
b    0.6892
c    0.4653
d    0.4653
e    0.6892

2.3.5.3. Interlacing, Lifting, Arbitrary Matrices and Arbitrary Partitions

Thus, an arbitrary matrix, M, and partition, $\mathcal{B}$, of M’s rows and columns
yields the block matrix, B, and the quotient matrix,
$\mathbf{Q} = \mathbf{B}^{+} \cdot \mathbf{M} \cdot \mathbf{B} = (\mathbf{B}^{T} \cdot \mathbf{B})^{-1} \cdot \mathbf{B}^{T} \cdot \mathbf{M} \cdot \mathbf{B}$.
Moreover, Q’s eigenvalues interlace M’s eigenvalues, since Q’s eigenvalues are contained in
the intervals bounded by M’s eigenvalues, i.e.,

$\lambda_{n-k+i}(\mathbf{M}) \le \lambda_i(\mathbf{Q}) \le \lambda_i(\mathbf{M}), \quad 1 \le i \le k$, (11)

where Q is a $k \times k$ matrix, M is an $n \times n$ matrix, $k \le n$, and
$b = k = |\mathcal{B}|$. If an arbitrary partition, $\mathcal{B}$, is simply an equitable
partition, such as M’s coarsest equitable partition, Q’s eigenvalues are a subset of M’s
eigenvalues, i.e., given $1 \le i \le k$ and $1 \le j \le n$,

$\lambda_i(\mathbf{Q}) = \lambda_j(\mathbf{M})$. (12)

If $\mathcal{B}$ is equitable, given some arbitrary eigenvalue of Q,
$\lambda_i(\mathbf{Q})$, and an eigenvalue of M, $\lambda_j(\mathbf{M})$, such that
$\lambda_i(\mathbf{Q}) = \lambda_j(\mathbf{M})$, an eigenvector, $\mathbf{r}_i$, associated
with $\lambda_i(\mathbf{Q})$, is related to eigenvector, $\mathbf{x}_j$, associated with
$\lambda_j(\mathbf{M})$ by the relationship,

$\mathbf{x}_j = \mathbf{B} \cdot \mathbf{r}_i$. (13)

The quotient matrix, $\mathbf{Q} = \mathbf{B}^{+} \mathbf{M} \mathbf{B}$, equals M’s
corresponding average row sums, i.e.,
$Q_{ij} = \frac{1}{|b_i|} \sum_{u \in b_i} \sum_{v \in b_j} M_{uv}$, where
$b_i, b_j \in \mathcal{B}$, $u \in b_i$, $v \in b_j$, $i,j \le k$ and $k \le n$ [Hae95].
If $\mathcal{B}$ is equitable, the summation, $Q_{ij}$, simplifies to
$Q_{ij} = z_{ij} \cdot M_{uw}$ such that $u \in b_i$, $w \in b_j$, $\{u,w\} \in E$, and
$z_{ij} = |N(u) \cap b_j|$, the number of u’s neighbors contained in block $b_j$ [Hae95].
Therefore, if $\mathcal{B}$ is an equitable partition, a summation of as many as n values
can be reduced to a single multiplication operation.

2.4. A Brief Interlude: Eigen Decomposition Applications in Graph Theory

Eigen decomposition is the basis of several fundamental results in graph theory. A
classic example is that the number of components in the graph, denoted $\kappa(G)$,
equals the multiplicity of ‘0’ in the eigenvalues of the graph’s Laplacian matrix, denoted
L, where $\kappa(G) = |\{i : \lambda_i(\mathbf{L}) = 0\}|$ [Chu94]. The Laplacian is
constructed by subtracting the graph’s adjacency matrix, A, from its degree matrix, D,
where $D_{ii} = \deg(v_i)$, thus $\mathbf{L} = \mathbf{D} - \mathbf{A}$.

Some  recent  results  are  based  on  applying  the  signless  Laplacian,  L  =  D  +  A  [HaS04]. 
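The relationship between the Laplacian and the number of components can be checked without an eigenvalue solver, because the Laplacian is symmetric and the multiplicity of its zero eigenvalue therefore equals n minus its rank. The sketch below compares that value with a direct component count. The helper names, the exact-fraction Gaussian elimination, and the example graph (the house graph plus a disjoint edge {f,g}) are assumptions chosen for illustration.

```python
from fractions import Fraction


def laplacian(adj):
    """L = D - A, with vertices taken in sorted order."""
    vs = sorted(adj)
    return [[len(adj[u]) if u == v else -int(v in adj[u]) for v in vs] for u in vs]


def rank(matrix):
    """Rank by Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def count_components(adj):
    """Count connected components with an iterative depth-first search."""
    seen, count = set(), 0
    for start in adj:
        if start in seen:
            continue
        count += 1
        stack = [start]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(adj[v])
    return count


# House graph plus a separate edge {f, g}: two components.
G = {"a": "be", "b": "ace", "c": "bd", "d": "ce", "e": "abd", "f": "g", "g": "f"}
L = laplacian(G)
print(len(L) - rank(L), count_components(G))  # 2 2
```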

The eigen decomposition is used to partition graphs, e.g., mapping related parallel
tasks to some pool of processors [HeL93, GuM95, Got03]. Another application provides
solutions to the traveling salesman problem [Moh04]. Eigenvalues are also used to find
the rate at which a graph’s stationary distribution is reached, e.g., to assess data structure
resiliency [AsW05] or determine how quickly the PageRank vector converges [HaK03].
A beautiful application uses eigenvectors as vertex coordinates [Kor05]. For example, the
drawing of the Buckminsterfullerene molecule, $C_{60}$, or buckyball, illustrated in Figure 30
is based on three eigenvectors of the buckyball’s signless Laplacian.


Figure 30. 3-D Buckyball Drawing Based on Its Signless Laplacian’s Eigenvectors

2.5. The PageRank Algorithm

Another  significant  application  of  the  eigen  decomposition  to  graph  theory  is  the 
PageRank  algorithm  used  in  certain  search  engines  to  order  query  responses,  such  as  the 
web  pages  matching  a  user’s  search  criteria  [PBM+98].  For  example,  the  mansion  graph 
shown  in  Figure  31(a)  yields  the  PageRank  vector  listed  in  Table  21(a),  which  is  a  unique 
eigenvector  that  always  exists,  as  described  in  this  section.  Sorting  the  PageRank  vector’s 
entries  yields  the  vector  listed  in  Table  21(b),  which  is  reflected  in  Figure  31(b). 

The  PageRank  vector  is  unique  up  to  isomorphism.  Since  every  entry  contained  in 
the  mansion  graph’s  PageRank  vector  is  distinct,  every  mansion  graph  isomorph  yields  a 
vertex  ordering  equivalent  to  the  vertex  ordering  shown  in  Table  21(b).  For  example,  the 
isomorph  shown  in  Figure  31(c)  yields  the  sorted  PageRank  vector  shown  in  Table  21(c), 
which  is  equivalent  to  the  ordering  shown  in  Table  21(b)  and  reflected  in  Figure  31(b). 


(a)  Mansion  Graph  (b)  PageRank  Order  (c)  Another  Isomorph 


Figure 31. Mansion Graph

Table 21. Mansion Graph’s PageRank Vector

(a) PageRank Vector

a    0.126
b    0.236
c    0.195
d    0.182
e    0.180
f    0.080

(b) PageRank Order

0.236    b
0.195    c
0.182    d
0.180    e
0.126    a
0.080    f

(c) Isomorph’s Order

0.236    a
0.195    b
0.182    c
0.180    d
0.126    e
0.080    f

Conversely, if the PageRank vector contains duplicate entries, it cannot induce a
canonical ordering. That issue can be resolved in some applications, e.g., search engines,
by sorting on more keys, e.g., web page addresses. Alternatively, ties in PageRank value
can be broken by the vertex order in a canonical isomorph produced by applications such
as nauty (cf. Sections 1.2 and 2.3.4.2).

For example, the house graph depicted in Figure 32(a) yields the PageRank vector
listed in Table 22(a). The house graph's canonical isomorph yielded by nauty is shown in
Figure 32(b) and corresponds to applying the permutation, [3, 4, 1, 2, 5], where each vertex
mapping, $r \to s$, annotated with the indices $i$ and $j$, denotes that vertex $v_i$, labeled $r$,
is mapped to vertex $v_j$, labeled $s$.
The two graphs are isomorphs; thus, their PageRank vectors are also related by the same
vertex permutation, as reflected in Tables 22(a) and (b), where the canonical isomorph
induces the canonical vertex ordering listed in Table 22(c) and shown in Figure 32(c).
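The relabeling and tie-breaking described above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis's implementation: the variable names are assumptions, the values are those of Table 22(a), and ties are broken by the alphabetical order of the canonical labels, which stands in for the canonical vertex order.

```python
# House graph PageRank values from Table 22(a), indexed by vertex label.
pagerank = {"a": 0.168, "b": 0.244, "c": 0.172, "d": 0.172, "e": 0.244}

# Permutation [3, 4, 1, 2, 5]: the vertex at position i maps to position perm[i - 1].
labels = ["a", "b", "c", "d", "e"]
perm = [3, 4, 1, 2, 5]

# Relabel: the value held by labels[i] is assigned to labels[perm[i] - 1].
canonical = {labels[p - 1]: pagerank[labels[i]] for i, p in enumerate(perm)}

# Sort by descending PageRank value, breaking ties by canonical label order.
order = sorted(canonical, key=lambda v: (-canonical[v], v))
print(order)
```

Sorting on the pair (negated value, label) is one simple way to apply a secondary key; any total order on the canonical labels would serve the same purpose.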


(a) House Graph  (b) Canonical Isomorph  (c) PageRank Order

Figure 32. House Graph

Table 22. House Graph's PageRank Vector

(a) House Graph    (b) Canonical Isomorph         (c) PageRank Order
a   0.168          1 → 3   a → c   0.168          d   0.244
b   0.244          2 → 4   b → d   0.244          e   0.244
c   0.172          3 → 1   c → a   0.172          a   0.172
d   0.172          4 → 2   d → b   0.172          b   0.172
e   0.244          5 → 5   e → e   0.244          c   0.168

However, finding a canonical isomorph of certain graphs may require exponential
time  [Miy96].  Access  to  a  canonical  isomorph  also  does  not  reduce  any  numerical  errors 
in  the  PageRank  vector  if  the  vector  is  obtained  using  standard  finite-precision  arithmetic, 
e.g.,  as  is  specified  in  the  IEEE  754  standard  [ISB85].  The  three  algorithms  described  in 
Chapters  4  and  5  eliminate  a  certain  class  of  numerical  errors  that  may  be  present  in  the 
PageRank  vector.  More  importantly,  two  of  these  algorithms  dramatically  reduce  the  time 
needed  to  compute  the  PageRank  vector  of  certain  graphs. 

To  obtain  this  vertex  order,  the  PageRank  algorithm  perturbs  the  adjacency  matrix 
and  builds  a  strictly  positive  stochastic  matrix  in  which  each  entry  reflects  the  probability 
of  visiting  a  vertex  from  another  vertex.  Assuming  the  probabilities  are  independent,  each 
entry  reflects  the  probability  of  transitioning  between  states  in  a  Markov  chain.  Applying 
the  Perron-Frobenius  and  Perron  theorems  establishes  the  matrix  must  yield  the  dominant 
eigenvalue,  one.  Normalizing  the  unique  and  associated  dominant  eigenvector  yields  the 
Markov  chain’s  stationary  distribution,  where  an  entry  reflects  the  probability  of  visiting 
some vertex, e.g., nodes in a UAV swarm [PBM+98, LaM03, LaM06, PSC05].

Thus,  the  eigenvector  orders  a  graph’s  vertices  based  on  PageRank  values  yielded 
by  its  perturbed  adjacency  matrix.  PageRank  values  are  unique  up  to  isomorphism,  i.e., 
vertices  have  the  same  PageRank  value  across  all  input  graph  isomorphs.  Hence,  vertices 
contained  in  the  same  block  of  the  orbit  partition  also  must  have  equal  PageRank  values. 
In other words, given vertices u and v such that a set of automorphisms maps $u \to v$ and
an equal number maps $v \to u$, the vertices, u and v, must have equal PageRank values.
The  proof  developed  in  Section  3.4  shows  the  same  result  holds  for  the  coarsest  equitable 
partition,  i.e.,  vertices  contained  in  the  same  block  must  have  equal  PageRank  values. 
2.5.1. Computing the PageRank Perturbation

The first step in obtaining the PageRank vector is to perturb the adjacency matrix,
A, to obtain a positive column-stochastic matrix, S, i.e., $S_{i,j} > 0, \forall i,j$ and
$\sum_i S_{i,j} = 1, \forall j$.
The PageRank perturbation applies the degree matrix, D, used to obtain several stochastic
perturbations [Sin64, SiK67, HZL05], where $D_{j,j}$ equals the degree of vertex $v_j$, i.e.,

$$D_{i,j} = \begin{cases} \deg(v_j), & i = j \\ 0, & i \neq j. \end{cases}$$

If A is an adjacency matrix of a connected graph, D's diagonal entries are strictly
non-zero. Hence, its inverse, denoted $D^{-1}$, is found by reciprocating D's diagonal entries,
i.e., $D^{-1}_{i,i} = 1/D_{i,i}, \forall i$. For example, the house graph's adjacency matrix shown in Table 23
yields the degree and inverse matrices listed in Table 24. D's diagonal entries correspond
to the graph's vertex degree partition, which for the house graph is $[\{a,c,d\},\{b,e\}]$.

Table 23. House Graph's Adjacency Matrix, A

      a   b   c   d   e
  a   0   1   0   0   1
  b   1   0   1   0   1
  c   0   1   0   1   0
  d   0   0   1   0   1
  e   1   1   0   1   0

Table 24. House Graph's Degree Matrix and Degree Matrix Inverse

(a) D

      a   b   c   d   e
  a   2   0   0   0   0
  b   0   3   0   0   0
  c   0   0   2   0   0
  d   0   0   0   2   0
  e   0   0   0   0   3

(b) D^{-1}

      a     b     c     d     e
  a   1/2   0     0     0     0
  b   0     1/3   0     0     0
  c   0     0     1/2   0     0
  d   0     0     0     1/2   0
  e   0     0     0     0     1/3
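As a hedged illustration of the construction above, the following Python sketch builds the house graph's degree matrix and its inverse from the adjacency matrix in Table 23. The variable names are assumptions made for this example, exact rational arithmetic is used only to keep the reciprocals readable, and the sketch assumes every vertex has non-zero degree.

```python
from fractions import Fraction

# House graph adjacency matrix from Table 23 (rows and columns ordered a..e).
A = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
]
n = len(A)

# Vertex degrees are the column sums of A.
degrees = [sum(A[i][j] for i in range(n)) for j in range(n)]

# D is diagonal, so its inverse simply reciprocates each diagonal entry.
D = [[degrees[i] if i == j else 0 for j in range(n)] for i in range(n)]
D_inv = [[Fraction(1, degrees[i]) if i == j else Fraction(0) for j in range(n)]
         for i in range(n)]

# Group vertex labels by degree to obtain the vertex degree partition.
partition = {}
for label, deg in zip("abcde", degrees):
    partition.setdefault(deg, []).append(label)
print(degrees, partition)
```

Grouping the labels by degree reproduces the vertex degree partition read off D's diagonal.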
A row-stochastic matrix, S, is a non-negative matrix whose rows each sum to one,
i.e., $\sum_j S_{i,j} = 1$. Given an arbitrary adjacency matrix, A, and its degree matrix, D, obtained
by summing A's rows, a row-stochastic matrix, S, can be obtained by computing

$S = D^{-1} \cdot A$.  (15)

A column-stochastic matrix, S, is a non-negative matrix whose columns sum to
one, i.e., $\sum_i S_{i,j} = 1$. Thus, summing A's columns to obtain a degree matrix, D, enables
constructing a column-stochastic matrix, S, by inverting the multiplication order, where

$S = A \cdot D^{-1}$.  (16)

For example, applying the row-stochastic perturbation, (15), to the adjacency and
degree matrices enumerated in Tables 23 and 24, respectively, yields the row-stochastic
matrix listed in Table 25. Conversely, applying the column-stochastic perturbation, (16),
to these same matrices yields the column-stochastic matrix listed in Table 26.
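The two perturbations, (15) and (16), differ only in whether rows or columns are scaled. The Python sketch below, which uses assumed variable names and exact fractions, applies both to the house graph and records the resulting row and column sums.

```python
from fractions import Fraction

# House graph adjacency matrix from Table 23 (rows and columns ordered a..e).
A = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
]
n = len(A)
deg = [sum(row) for row in A]  # A is symmetric, so row and column sums agree

# Row-stochastic, (15): S = D^{-1} A scales row i by 1/deg(i).
S_row = [[Fraction(A[i][j], deg[i]) for j in range(n)] for i in range(n)]

# Column-stochastic, (16): S = A D^{-1} scales column j by 1/deg(j).
S_col = [[Fraction(A[i][j], deg[j]) for j in range(n)] for i in range(n)]

row_sums = [sum(r) for r in S_row]
col_sums = [sum(S_col[i][j] for i in range(n)) for j in range(n)]
print(row_sums, col_sums)
```

Because the house graph is undirected, the column-stochastic matrix is simply the transpose of the row-stochastic matrix; that shortcut would not hold for a directed graph.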


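Table 25. A Row-Stochastic Matrix, $\sum S(i,:) = 1$

        a     b     c     d     e   |  Σ
  a    0.0   0.5   0.0   0.0   0.5  |  1
  b    0.3   0.0   0.3   0.0   0.3  |  1
  c    0.0   0.5   0.0   0.5   0.0  |  1
  d    0.0   0.0   0.5   0.0   0.5  |  1
  e    0.3   0.3   0.0   0.3   0.0  |  1
  Σ    0.6   1.3   0.8   0.8   1.3

Values are truncated to one decimal place, e.g., 0.3 denotes 1/3.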

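Table 26. A Column-Stochastic Matrix, $\sum S(:,j) = 1$

        a     b     c     d     e   |  Σ
  a    0.0   0.3   0.0   0.0   0.3  |  0.6
  b    0.5   0.0   0.5   0.0   0.3  |  1.3
  c    0.0   0.3   0.0   0.5   0.0  |  0.8
  d    0.0   0.0   0.5   0.0   0.3  |  0.8
  e    0.5   0.3   0.0   0.5   0.0  |  1.3
  Σ    1     1     1     1     1

Values are truncated to one decimal place, e.g., 0.3 denotes 1/3.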
The perturbation applied in the PageRank algorithm modifies the original column-stochastic
perturbation, (16), by applying a scaling factor, $\alpha \in [0.0, 1.0]$, and a shifting
factor, $(1-\alpha)/n$, where $n = |V|$, the number of vertices. The stochastic PageRank matrix
perturbation, or scaled and shifted modification of (16), is

$S = \alpha \cdot A \cdot D^{-1} + (1-\alpha)/n$,  (17)

where A and D denote a graph's adjacency and degree matrices, respectively [PBM+98],
and $1-\alpha$ is denoted $\delta$, yielding $S = \alpha \cdot A \cdot D^{-1} + \delta/n$. If $\alpha = 1$, the PageRank perturbation
reduces to the column-stochastic perturbation (16), $S = A \cdot D^{-1}$. In essence, decreasing $\alpha$,
and thus increasing $\delta$, causes S's entries to be less dependent on the value of A's entries.
In the opposite extreme, if $\alpha = 0$, then $\delta = 1$ and $S_{i,j} = 1/n$. The PageRank algorithm's
developers suggest a scaling, or damping, factor of $\alpha = 0.85$, hence $\delta = 0.15$ [PBM+98].
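A short worked derivation, offered as a sketch under the assumption that every vertex has non-zero degree, shows why the perturbation (17) always yields a column-stochastic matrix. Here $\alpha$ denotes the scaling factor, $n$ the number of vertices, and $D_{j,j} = \sum_i A_{i,j}$ the degree of vertex $v_j$:

```latex
\sum_{i=1}^{n} S_{i,j}
  = \alpha \sum_{i=1}^{n} \frac{A_{i,j}}{D_{j,j}} + \sum_{i=1}^{n} \frac{1-\alpha}{n}
  = \alpha \cdot \frac{D_{j,j}}{D_{j,j}} + (1-\alpha)
  = 1, \qquad \forall j .
```

The row sums receive no such guarantee, because the entries of row $i$ are scaled by the degrees of the neighbors of $v_i$ rather than by a single common degree.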

For example, the adjacency matrix of the house graph listed in Table 23 yields the
degree and inverse matrices listed in Table 24. Applying the PageRank perturbation, (17),
to these matrices, where $\alpha = 0.85$ and $\delta = 0.15$, yields the column-stochastic matrix, i.e.,
the PageRank matrix, listed in Table 27. As this example demonstrates, all of the columns
in the PageRank matrix sum to one, whereas its rows typically do not sum to one.
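The perturbation (17) can be sketched in Python as follows. The function name `pagerank_matrix` and its default argument are assumptions made for this illustration; the sketch uses dense lists and assumes no vertex has degree zero.

```python
def pagerank_matrix(A, alpha=0.85):
    """Return S = alpha * A * D^{-1} + (1 - alpha) / n as a list of rows."""
    n = len(A)
    deg = [sum(A[i][j] for i in range(n)) for j in range(n)]  # column sums
    shift = (1.0 - alpha) / n
    return [[alpha * A[i][j] / deg[j] + shift for j in range(n)]
            for i in range(n)]

# House graph adjacency matrix from Table 23 (rows and columns ordered a..e).
A = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
]
S = pagerank_matrix(A)
col_sums = [sum(S[i][j] for i in range(5)) for j in range(5)]
row_sums = [sum(row) for row in S]
print([round(v, 4) for v in row_sums])
```

Every column sums to one up to floating-point rounding, while the row sums differ from one, in line with the observation above.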


Table 27. House Graph's PageRank Matrix, S, $\alpha = 0.85$

         a        b        c        d        e     |    Σ
  a    0.0300   0.3133   0.0300   0.0300   0.3133  |  0.7167
  b    0.4550   0.0300   0.4550   0.0300   0.3133  |  1.2833
  c    0.0300   0.3133   0.0300   0.4550   0.0300  |  0.8583
  d    0.0300   0.0300   0.4550   0.0300   0.3133  |  0.8583
  e    0.4550   0.3133   0.0300   0.4550   0.0300  |  1.2833
  Σ    1        1        1        1        1
Applying the scaling and shifting modifications, $\alpha$ and $\delta$, to the column-stochastic
perturbation, (16), satisfies two objectives. First, the perturbation satisfies the assumption
that any vertex, or web page, can be randomly visited. This assumption is often called the
random surfer model [LaM03]. Second, the column-stochastic PageRank matrix, S, is a
primitive matrix, where a matrix, M, is primitive:

1. if M is irreducible and M contains one or more positive diagonal entries,

2. or equivalently, if and only if $M^k > 0$ for some $k > 0$.

The PageRank perturbation, and more specifically, the shifting factor, $\delta/n$, forces every
entry in S to be strictly positive, hence $S^k > 0, \forall k > 0$.

The PageRank matrix, S, is also irreducible, where an arbitrary matrix, M, is irreducible
if no permutation matrix, P, yields an isomorph of M such that

$$P \cdot M \cdot P^T = \begin{bmatrix} X & Y \\ 0 & Z \end{bmatrix},$$  (18)

where X and Z are square matrices. The PageRank matrix, S, is irreducible since each
entry is strictly positive, i.e., S is constructed such that none of its entries equal zero.

An arbitrary PageRank matrix, S, is irreducible and satisfies the Perron-Frobenius
theorem's conditions [LaM06]. Moreover, S is also a primitive matrix and satisfies the
conditions of the Perron theorem, which is a more powerful theorem [LaM06]. Applying
either theorem shows that the PageRank matrix, S, yields a positive eigenvalue associated
with an eigenvector that is unique up to isomorphism and scaling, whereas eigenvectors are
often not unique. Moreover, applying the Perron theorem establishes it is the unique dominant
eigenvalue, i.e., the eigenvalue having the largest magnitude. Similarly, the eigenvector is
the unique dominant eigenvector defining the graph's stationary distribution.
A primitive, irreducible, and stochastic PageRank matrix, S, reflects the transition
probabilities between pairs of states in an aperiodic Markov chain, a memoryless random
process where the probability of entering the next state only depends on the current state.
In an aperiodic Markov chain, any state may be entered during any transition, a property
satisfied by S being irreducible and primitive [LaM06]. S is constant after the PageRank
perturbation (17) is applied, i.e., all of S's transition probabilities remain constant; thus, S
represents a stationary Markov chain. Therefore, S yields a unique stationary probability
distribution, $x_t$, where for all $i$, $x_{i+1} = S \cdot x_i$, and there exists some value, $t$, such that

$x_t = S \cdot x_t$.  (19)

Given that the PageRank matrix, S, is primitive, irreducible, and stochastic, where
S represents the transition probabilities of a stationary aperiodic Markov chain, it can be
shown that its dominant eigenvalue equals one [LaM06]. More importantly, the dominant
eigenvector, x, reflects the stationary distribution of a memoryless random process, or the
unique probability a random surfer visits each vertex. This stationary result is guaranteed
by the delta shift value, $\delta$, which ensures each vertex has a nominal probability of being
visited. Alternatively stated, applying the additive delta shift value, $\delta$, ensures each vertex
can be reached from any other vertex, or in the context of a Markov chain, that any state
can be randomly reached from any other state.

Finally, and most significantly, the unique dominant eigenvector, x, produced by
the PageRank matrix, S, is unique up to graph isomorphism. Thus, an arbitrary vertex, $v_i$,
has the associated PageRank value, $x_i$, with respect to any permutation matrix, P, where

$P \cdot A \cdot P^T \Rightarrow P \cdot S \cdot P^T \Rightarrow P \cdot x$.  (20)
2.5.2. Computing the PageRank Vector
2.5.2.1. Power Method Iteration

There are many ways to find the PageRank vector, e.g., by applying MATLAB's
'eig' function for finding eigen decompositions. Another method is the power method,
where the PageRank matrix, S, is multiplied by the current PageRank vector, $x_t$, yielding the
revised estimate of the PageRank vector, $x_{t+1}$ [LaM06]. The vector, $x_{t+1}$, is normalized
by applying an arbitrary norm, e.g., $x_{t+1} \leftarrow x_{t+1}/\|x_{t+1}\|_1 = x_{t+1}/\sum x_{t+1}$. Thus, each iteration
simply requires computing a normalized dot product, where

$x_{t+1} \leftarrow S \cdot x_t$,  (21)

followed by

$x_{t+1} \leftarrow x_{t+1}/\sum x_{t+1}$.  (22)

The initial entries in x can equal an arbitrary value, such that $x_0(i) \in [0.0, 1.0]$ and
$\sum x_0 = 1$, e.g., $x_0 = 1^{n \times 1}/n$. The power method terminates after some number of iterations
is performed or some numerical tolerance, $\tau$, is obtained, e.g., $\|x_{t+1} - x_t\| \le \tau$, $\tau > 0$. The
power method derives its name from the observation that the iteration can be expressed by
the product of the power of S, denoted $S^t$, and the initialization vector, $x_0$, where

$x_1 \leftarrow [S \cdot x_0]/[\sum S \cdot x_0]$

$x_2 \leftarrow [S \cdot x_1]/[\sum S \cdot x_1] = [S \cdot (S \cdot x_0)]/[\sum S \cdot (S \cdot x_0)]$

$x_t \leftarrow [S^t \cdot x_0]/[\sum S^t \cdot x_0]$.

Finally, S's left eigenvector, y, is obtained by computing $y_{t+1}^T \leftarrow y_t^T \cdot S$, followed by the
requisite normalization, $y_{t+1} \leftarrow y_{t+1}/\sum y_{t+1}$.
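The iteration in (21) and (22) can be sketched as a single Python function. The name `power_step` and the 2x2 matrix below are hypothetical, chosen only so the stationary vector can be checked by hand; they do not come from the source.

```python
def power_step(S, x):
    """One power method iteration: x <- S.x, then normalize by the 1-norm."""
    n = len(S)
    y = [sum(S[i][j] * x[j] for j in range(n)) for i in range(n)]
    total = sum(y)
    return [v / total for v in y]

# Hypothetical 2x2 positive column-stochastic matrix, for illustration only.
S = [[0.9, 0.5],
     [0.1, 0.5]]

x = [0.5, 0.5]  # uniform start, x_0 = 1/n
for _ in range(50):
    x = power_step(S, x)
print(x)
```

For this matrix the fixed point of $x = S \cdot x$ can be solved directly: $0.1 \cdot x(1) = 0.5 \cdot x(2)$ with $x(1) + x(2) = 1$ gives $x = [5/6, 1/6]$, which the iteration approaches.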
2.5.2.2. Expected Number of Power Method Iterations

The upper bound on the number of power method iterations required to obtain the
PageRank vector, x, to some precision $\tau$, e.g., $\tau \le 0.001$, is based on the convergence rate,
the rate that $c^t \to 0$ for some value c. The value, c, is bounded by the magnitude ratio of
S's two most dominant eigenvalues, i.e., $c \le |\lambda_2|/|\lambda_1|$. S is a stochastic matrix, therefore,
$\lambda_1 = 1$, which yields $c \le |\lambda_2|$. It has also been shown that the second dominant eigenvalue
is bounded by the scaling value, $\alpha$, thus, $c \le |\lambda_2| \le \alpha$ [HaK03]. Since $c^t \le \tau$, the upper
bound on the number of power method iterations, t, needed to obtain the PageRank vector
using floating-point arithmetic computed in some base, b, is [GoV88, HaK03, LaM06]

$$t \le \frac{\log_b \tau}{\log_b c} \le \frac{\log_b \tau}{\log_b |\lambda_2(S)|} \le \frac{\log_b \tau}{\log_b \alpha} \;\Rightarrow\; t \le \log_{|\lambda_2(S)|} \tau \le \log_\alpha \tau.$$  (24)

For example, if a PageRank vector is found using binary floating-point arithmetic,
b = 2. Assuming $\alpha = 0.85$ and $\tau \le 0.001$,

$$t \le \log_{|\lambda_2(S)|} \tau \le \log_\alpha \tau = \frac{\log_b \tau}{\log_b \alpha} = \frac{\log_{10} 0.001}{\log_{10} 0.85} = \frac{\log_2 0.001}{\log_2 0.85} = 42.5043.$$  (25)

Therefore, performing 43 power method iterations ensures the PageRank vector is correct
to three decimal digits, or equivalently, approximately ten bits.
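The arithmetic in (25) is easy to reproduce. The following Python sketch evaluates the bound $\log_\alpha \tau$ for the stated values; the function name `max_power_iterations` is an assumption made for this example.

```python
import math

def max_power_iterations(alpha, tau):
    """Upper bound on iterations: smallest integer t with alpha**t <= tau."""
    return math.ceil(math.log(tau) / math.log(alpha))

# The ratio of logarithms is independent of the base, so natural logs suffice.
bound = math.log(0.001) / math.log(0.85)
t = max_power_iterations(0.85, 0.001)
print(round(bound, 4), t)
```

Since both logarithms are negative, their ratio is positive, and rounding up gives the whole number of iterations that guarantees the tolerance.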

For many graphs, $\log_2 n$ power method iterations often appears to yield sufficient
precision in the PageRank vector [PBM+98]. However, a formal lower bound has not yet
been established on the number of iterations required to obtain the required precision, $\tau$.
As shown in Section 3.2, assuming $\tau \le 1/n$ and $\alpha \ge 0.5$, a practical lower bound on the
number of necessary power method iterations, t, is $\log_2 n$, i.e., $\log_2 n \le t$.
2.5.3. PageRank: An Algorithm for Ranking Vertices

The PageRank algorithm consists of two key steps: applying a perturbation to the
graph's adjacency matrix that yields a strictly positive stochastic matrix, and determining
the dominant eigenvector of the stochastic matrix by computing an iterated dot product.
The eigenvector's entries correspond to the probability the corresponding vertices will be
visited by an object that randomly selects its destination. The entire PageRank algorithm
is listed in Figure 33, where the stochastic PageRank perturbation is applied on lines 2-6.
Since the power method terminates based on the numerical differences with respect to the
last eigenvector estimate, the two vectors, x and s, are initialized on lines 8 and 9, along
with an iteration counter, z, on line 10.

The power method loop is entered on line 12 and the eigenvector estimate, x, is
copied to the vector, s, on line 14. The eigenvector is revised on line 16, where $x = S \cdot x$.
Normalizing by the column sum norm, or 1-norm, $\|x\|_1$, on line 18 ensures x sums to one,
i.e., that x is a probability distribution. The norm choice is arbitrary, e.g., the Euclidean,
or spectral norm, $\|x\|_2$, could be used. The power method terminates after the tolerance, $\tau$,
is obtained and the upper bound on the number of power method iterations is $t = \log_\alpha \tau$.

The PageRank perturbation requires $O(n^2)$ time, where $n = |V|$, since each entry
in A is scaled and shifted. Given a tolerance, $\tau$, and scaling factor, $\alpha$, the lower and upper
bounds on the power method's complexity are $\Omega(n^2 \log n)$ and $O(n^2 \cdot t)$, respectively,
where $t = \log_\alpha \tau$. If the graph is stored using sparse matrices, the lower and upper bounds
can be reduced to $\Omega(m \log n)$ and $O(m \cdot t)$, respectively, where $m = |E|$, the number of
edges contained in the graph.
A common modification of the PageRank algorithm uses a personalization vector,
v, containing user-specified probabilities. A potential application is "to decrease the effect
of spamming done by the so-called link farms" [LaM06]. The personalization vector, v, is
used to modify the perturbation on line 6 in Figure 33, which becomes [LaM03, LaM06]

$S = \alpha \cdot A \cdot D^{-1} + (1-\alpha) \cdot v \cdot 1^{1 \times n}$.  (26)
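A minimal sketch of the personalized perturbation (26) follows. The function name and the particular personalization vector are hypothetical; the only property the sketch relies on is that v is non-negative and sums to one, so that each column of S still sums to one.

```python
def personalized_pagerank_matrix(A, v, alpha=0.85):
    """Return S = alpha * A * D^{-1} + (1 - alpha) * v * 1^T as a list of rows."""
    n = len(A)
    deg = [sum(A[i][j] for i in range(n)) for j in range(n)]  # column sums
    # Entry (i, j) of the outer product v * 1^T is simply v[i].
    return [[alpha * A[i][j] / deg[j] + (1.0 - alpha) * v[i] for j in range(n)]
            for i in range(n)]

# House graph adjacency matrix from Table 23 (rows and columns ordered a..e).
A = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
]
v = [0.4, 0.3, 0.1, 0.1, 0.1]  # hypothetical personalization vector
S = personalized_pagerank_matrix(A, v)
col_sums = [sum(S[i][j] for i in range(5)) for j in range(5)]
print(col_sums)
```

Choosing the uniform vector, v = 1/n, recovers the unpersonalized perturbation (17).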

 1.  getPageRank(A, n, α, τ)
 2.      # construct degree matrix
 3.      d ← 1^{1×n} · A
 4.      D ← diag(d)
 5.      # apply PageRank perturbation
 6.      S ← α · A · D^{-1} + (1 − α)/n
 7.      # initialize vectors and counter
 8.      x ← 1^{n×1}/n
 9.      s ← 0^{n×1}
10.      z ← 0
11.      # iterate power method
12.      while (‖s − x‖_2 > τ)
13.          # save PageRank vector
14.          s ← x
15.          # update PageRank vector
16.          x ← S · x
17.          # normalize PageRank vector
18.          x ← x / Σx
19.          # increment loop counter
20.          z ← z + 1
21.      end while
22.      return x
23.  end getPageRank

Figure 33. PageRank: An Algorithm for Ordering Vertices [PBM+98]
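For readers who want to experiment, the listing in Figure 33 can be transcribed into runnable Python roughly as follows. This is a sketch under stated assumptions rather than the author's implementation: the name `get_pagerank`, the default tolerance, and the use of dense Python lists are choices made for this example, and the graph is assumed to contain no isolated vertices.

```python
import math

def get_pagerank(A, alpha=0.85, tau=1e-4):
    """Power-method PageRank following the structure of Figure 33."""
    n = len(A)
    # Degree of each vertex (column sums of A); assumes no isolated vertices.
    deg = [sum(A[i][j] for i in range(n)) for j in range(n)]
    # PageRank perturbation, equation (17).
    S = [[alpha * A[i][j] / deg[j] + (1.0 - alpha) / n for j in range(n)]
         for i in range(n)]
    x = [1.0 / n] * n  # current estimate
    s = [0.0] * n      # previous estimate
    iterations = 0
    # Iterate until the Euclidean distance between estimates is within tau.
    while math.sqrt(sum((si - xi) ** 2 for si, xi in zip(s, x))) > tau:
        s = x
        x = [sum(S[i][j] * s[j] for j in range(n)) for i in range(n)]
        total = sum(x)
        x = [xi / total for xi in x]
        iterations += 1
    return x, iterations

# Paw graph adjacency matrix (rows and columns ordered a..d): a is adjacent
# to b, c, and d, and b is adjacent to c.
paw = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
x, iterations = get_pagerank(paw)
print([round(v, 3) for v in x], iterations)
```

The stopping rule compares successive estimates, so the number of iterations this sketch reports depends on the chosen tolerance and need not match the iteration counts quoted in the text.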
The following example applies the PageRank algorithm to the paw graph shown
in Figure 34 [Wes01]. The corresponding adjacency, degree, and inverse degree matrices
are listed in Tables 28(a)-(c), respectively. The perturbation on lines 2-6 yields the graph
shown in Figure 35, where edges created by the perturbation are depicted as dotted lines
and the corresponding perturbed matrix is listed in Table 28(d). The initial PageRank and
history vectors, $x_0 = 1^{4 \times 1}/4$ and $s_0 = 0^{4 \times 1}$, are shown in Tables 29(a) and (b), respectively.
The revised vectors produced by executing the first power method iteration are shown in
Tables 29(c) and (d), respectively. The normalization step performed on line 18 does not
change x in this particular example.


Figure 34. Paw Graph [Wes01]

Figure 35. Applying the PageRank Perturbation to the Paw Graph, α = 0.85
Table 28. Paw Graph's Adjacency and PageRank Matrices, α = 0.85

(a) A

      a   b   c   d
  a   0   1   1   1
  b   1   0   1   0
  c   1   1   0   0
  d   1   0   0   0

(b) D

      a   b   c   d
  a   3   0   0   0
  b   0   2   0   0
  c   0   0   2   0
  d   0   0   0   1

(c) D^{-1}

      a     b     c     d
  a   1/3   0     0     0
  b   0     1/2   0     0
  c   0     0     1/2   0
  d   0     0     0     1

(d) $S = \alpha \cdot A \cdot D^{-1} + (1-\alpha)/n$, α = 0.85

                          source
                    a      b      c      d    |   Σ
  destination  a   0.04   0.46   0.46   0.89  |  1.85
               b   0.32   0.04   0.46   0.04  |  0.86
               c   0.32   0.46   0.04   0.04  |  0.86
               d   0.32   0.04   0.04   0.04  |  0.43
               Σ   1      1      1      1

The next iteration yields the PageRank vectors listed in Tables 29(e) and (f). The
power method does not converge to four decimal places until the 20th iteration, as shown
in the PageRank vector listed in Table 29(h), where

$\lceil \log_{|\lambda_2(S)| \approx 0.6194} 0.0001 \rceil = 20$.


Table 29. Paw Graph's PageRank Vector, α = 0.85

      (a) $x_0$    (b) $s_0$    (c) $s_1 = x_0$    (d) $x_1 = S \cdot x_0$
  a   0.25         0            0.25               0.4625
  b   0.25         0            0.25               0.2146
  c   0.25         0            0.25               0.2146
  d   0.25         0            0.25               0.1083

      (e) $s_2 = x_1$    (f) $x_2 = S \cdot x_1$    (g) $s_{20} = x_{19}$    (h) $x_{20} = S \cdot x_{19}$
  a   0.4625             0.3120                     0.3667                   0.3667
  b   0.2146             0.2597                     0.2459                   0.2459
  c   0.2146             0.2597                     0.2459                   0.2459
  d   0.1083             0.1685                     0.1414                   0.1414
The PageRank vector yielded by the paw graph for α = 0.85 corresponds to the
weighted  graph  shown  in  Figure  36(a).  Vertices  b  and  c  have  the  same  PageRank  value, 
0.2459,  which  is  illustrated  in  Figure  36(b)  using  a  shaded  overlay.  The  PageRank  values 
of  these  two  vertices  preclude  obtaining  a  canonical  vertex  order,  where  the  most  refined 
vertex  order  induced  by  this  PageRank  vector  is  illustrated  in  Figure  37(a). 

The  paw  graph’s  canonical  isomorph  produced  by  nauty  lists  vertex  b  before  c. 
Therefore,  the  tie  between  their  PageRank  values,  0.2459,  is  broken  by  sorting  on  their 
PageRank  values,  followed  by  their  order  in  the  canonical  isomorph.  Thus,  the  canonical 
isomorph  induces  the  canonical  vertex  order  shown  in  Figure  37(b).  However,  as  noted  in 
Section 2.3.4.2, determining a graph's canonical isomorph may require exponential time.
For  such  graphs,  a  non-canonical  vertex  order,  e.g.,  the  order  illustrated  in  Figure  37(a), 
may  be  the  best  that  can  be  obtained. 


(a) PageRank Values  (b) PageRank Partition

Figure 36. Paw Graph's PageRank Vector, α = 0.85


(a) PageRank Order  (b) Canonical Order

Figure 37. Paw Graph's PageRank Ordering, α = 0.85
As described at the end of Section 2.5.1, the PageRank vector is also unique up to
graph isomorphism. Thus, a vertex, $v_i$, that receives the PageRank value, $x_i$, receives the
same PageRank value after applying an arbitrary permutation matrix, P, i.e.,

$P \cdot A \cdot P^T \Rightarrow P \cdot S \cdot P^T \Rightarrow P \cdot x$.  (27)

Thus, two vertices contained in the same block of the orbit partition must yield
equal PageRank values. For example, the paw graph's orbit partition is $[\{d\},\{b,c\},\{a\}]$,
where each block's vertices yield the unique PageRank value, [0.1414, 0.2459, 0.3667],
respectively. The paw graph's orbit partition is the same as its coarsest equitable partition,
up to a block permutation. For example, applying 1-D Weisfeiler-Lehman stabilization to
the paw graph yields the coarsest equitable partition, $[\{d\},\{b,c\},\{a\}]$.
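One simple way to compute a coarsest equitable partition is iterated color refinement in the spirit of 1-D Weisfeiler-Lehman stabilization. The Python sketch below is a naive illustration, not the algorithm analyzed in this document; the function name and the adjacency-list encoding are assumptions, and no attempt is made to match the time bounds discussed elsewhere.

```python
def coarsest_equitable_partition(adj):
    """Refine vertex colors until every vertex in a block sees the same
    multiset of neighbor colors (1-D Weisfeiler-Lehman style refinement)."""
    color = {v: 0 for v in adj}  # start from the unit partition
    while True:
        # Signature: own color plus the sorted colors of all neighbors.
        sig = {v: (color[v], tuple(sorted(color[u] for u in adj[v])))
               for v in adj}
        relabel = {s: k for k, s in enumerate(sorted(set(sig.values())))}
        new_color = {v: relabel[sig[v]] for v in adj}
        # Refinement only splits blocks, so an unchanged count means stability.
        if len(set(new_color.values())) == len(set(color.values())):
            break
        color = new_color
    blocks = {}
    for v in sorted(adj):
        blocks.setdefault(color[v], []).append(v)
    return sorted(blocks.values())

# Paw graph: a is adjacent to b, c, and d; b is adjacent to c.
paw = {"a": ["b", "c", "d"], "b": ["a", "c"], "c": ["a", "b"], "d": ["a"]}
partition = coarsest_equitable_partition(paw)
print(partition)
```

The blocks are returned in sorted label order, so they agree with the partition above only up to a block permutation.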


The  orbit  partition  does  not  always  coincide  with  the  coarsest  equitable  partition. 
For  example,  the  cuneane  graph  yields  the  coarsest  equitable  partition  and  orbit  partition 
illustrated  in  Figures  38(a)  and  (b),  respectively.  Additionally,  each  vertex  yields  the  same 
PageRank  value,  0.1250.  Thus,  in  this  example,  applying  the  coarsest  equitable  partition 
suffices  to  identify  which  vertices  must  have  equal  PageRank  values. 


(a)  Coarsest  Equitable  Partition 


(b)  Orbit  Partition 


Figure  38.  Cuneane  Graph’s  Coarsest  Equitable  and  Orbit  Partitions 
2.6. Observations about Equitable Vertices and PageRank Values

Vertices contained in the same block of the orbit and coarsest equitable partitions
yield equal PageRank values. However, finite-precision arithmetic limitations may cause
vertices contained in the same block of these partitions to have unequal PageRank values.
The algorithms described in Chapters 4 and 5 ensure vertices contained in the same block
of these partitions have equal PageRank values. Moreover, two of the methods reduce the
time needed to find the PageRank vector if the coarsest equitable partition is non-discrete.
Conversely, none of the algorithms improve the PageRank algorithm's performance if the
graph's coarsest equitable partition is discrete.

For example, the house graph's coarsest equitable partition, $[\{c,d\},\{a\},\{b,e\}]$,
which is non-discrete, is illustrated in Figure 39(a). The PageRank values yielded by the
blocks' vertices are [0.172, 0.168, 0.244], respectively. The most non-discrete coarsest
equitable partition occurs if all vertices are contained in one block, i.e., the unit partition.
Such partitions are yielded by all k-regular graphs, in which every vertex has k neighbors.
For example, the coarsest equitable partition of the 4-regular octahedron graph shown in
Figure 39(b) is $[\{a,b,c,d,e,f\}]$ and each vertex has the PageRank value, $1/6 = 0.1\overline{6}$.
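The claim for k-regular graphs can be checked exactly with rational arithmetic. The sketch below builds an octahedron in which each vertex is adjacent to every vertex except itself and one antipode; the particular antipodal pairing of vertex indices is an assumption made for illustration, since any pairing yields an isomorphic graph.

```python
from fractions import Fraction

n, alpha = 6, Fraction(85, 100)

# Assumed antipodal pairing of vertex indices; each vertex skips its opposite.
opposite = {0: 5, 1: 4, 2: 3, 3: 2, 4: 1, 5: 0}
A = [[0 if j == i or j == opposite[i] else 1 for j in range(n)]
     for i in range(n)]
deg = [sum(A[i][j] for i in range(n)) for j in range(n)]

# PageRank matrix, equation (17), computed exactly.
S = [[alpha * A[i][j] / deg[j] + (1 - alpha) / n for j in range(n)]
     for i in range(n)]

# The uniform vector is a fixed point of x = S.x for a regular graph.
x = [Fraction(1, n)] * n
Sx = [sum(S[i][j] * x[j] for j in range(n)) for i in range(n)]
print(Sx)
```

Because every row of $A \cdot D^{-1}$ sums to one when all degrees are equal, multiplying S by the uniform vector returns the uniform vector without any rounding error.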


(a) House Graph, $[\{c,d\},\{a\},\{b,e\}]$  (b) Octahedron Graph, $[\{a,b,c,d,e,f\}]$

Figure 39. Coarsest Equitable Partitions of the House and Octahedron Graphs
Since the coarsest equitable partition of a k-regular graph contains one block, such
graphs yield the maximum performance gain with respect to the algorithms described in
Chapters 4 and 5. Although k-regular graphs are trivial for the PageRank algorithm, since
each vertex has the same PageRank value, they have many important uses. For instance,
k-regular graphs are used to assess applications such as nauty, since finding the canonical
isomorph of some k-regular graphs may cause such applications to need exponential time.

Some more interesting graph families with respect to the PageRank algorithm and
the results described in Chapters 3-5 are trees and grid graphs. For instance, many trees
and grid graphs yield a coarsest equitable partition containing blocks composed of
multiple vertices. For example, the 9-vertex random tree shown in Figure 40(a) yields the
coarsest equitable partition, $[\{a,c,g,i\},\{b,h\},\{d,f\},\{e\}]$. The 3x3 grid graph shown
in Figure 40(b) yields a non-discrete coarsest equitable partition containing three blocks,
$[\{a,c,g,i\},\{b,d,f,h\},\{e\}]$.


Figure 40. Two Graphs Yielding a Non-Discrete Coarsest Equitable Partition
2.7. Known Results

The  PageRank  algorithm  is  one  of  many  methods  that  use  eigenvector  centrality 
to  determine  relative  vertex  importance.  The  HITS  algorithm  orders  responses  using  the 
dominant  eigenvector  of  two  matrices  [LaM06].  Recent  social  network  research  relaxes 
an  equitable  partition’s  definition  to  determine  if  results  similar  to  those  described  herein 
can  be  obtained  on  larger  graphs  [BrL04,  Ler05].  The  mansion  graph  has  also  been  used 
to  define  an  “almost  equitable  partition”  and  its  relationship  to  the  Laplacian  eigenvectors 
of a graph (cf. Figure 1 and Section 2.4) [CDR07].

The  work  of  Boldi  et  al.  is  most  directly  related  to  the  results  described  herein, 
since  they  first  showed  that  vertices  contained  in  the  same  block  of  the  coarsest  equitable 
partition  have  equal  PageRank  values  [BLS+06].  The  earlier  proof  uses  tools  drawn  from 
category  theory,  the  minimum  base  and  its  associated  fibrations,  which  correspond  to  the 
coarsest  equitable  partition  and  its  associated  blocks,  respectively.  That  proof  establishes 
a  Markov  chain’s  limit  distribution  is  constant  within  fibrations,  i.e.,  the  PageRank  vector 
is  constant  within  blocks. 

Boldi  et  al.  also  show,  in  the  proof  accompanying  their  Theorem  9  [BLS+06],  that 
the  PageRank  vector  can  be  lifted  from  the  quotient  matrix  using  techniques  described  in 
Section  2.3.5.  They  suggest  applying  this  theorem  would  reduce  the  time  required  to  find 
PageRank  vectors,  but  do  not  define  such  a  method  or  analyze  its  performance.  However, 
they  do  construct  an  algorithm  for  finding  the  coarsest  equitable  partition  that  minimizes 
memory  usage.  Their  last  result  describes  graphs  based  on  European  web  pages  that  yield 
a  non-discrete  coarsest  equitable  partition.  That  result  suggests  the  PageRank  algorithm’s 
performance  can  be  improved  on  at  least  some  web  graphs. 
2.8. Summary

Section 2.1 introduced an application of the PageRank algorithm to UAV swarms,
determining which nodes are the most beneficial for injecting misinformation assumed to
be randomly propagated in the network. That section also explored similar applications in
social networks, e.g., finding members that facilitate spreading rumors or diseases.

The  remainder  of  Chapter  2  defines  tools  applied  in  Chapters  3-5  to  improve  the 
PageRank  algorithm’s  performance  if  two  or  more  nodes  yield  equal  PageRank  values. 
Section  2.2  explores  deciding  graph  isomorphism,  where  graphs  are  said  to  be  isomorphs 
if  they  define  equivalent  edges  up  to  a  vertex  permutation.  Section  2.3  defines  key  vertex 
partitions  commonly  used  to  help  decide  graph  isomorphism,  such  as  equitable  partitions. 

The  coarsest  equitable  partition  is  the  most  refined  partition  found  if  every  vertex 
is  placed  in  one  block  and  the  only  operations  used  are  the  sorting  and  comparison  of  any 
adjacent  vertex  labels,  e.g.,  as  done  in  1-D  Weisfeiler-Lehman  stabilization.  The  quotient 
graph  (matrix)  induced  by  a  partition  is  defined  in  Section  2.3.5.  The  eigenvector  of  a  key 
quotient  matrix  is  used  to  obtain  the  most  notable  results  herein,  as  shown  in  Chapter  5. 

Section  2.4  describes  some  notable  results  that  apply  a  graph’s  eigenvalues  and  its 
eigenvectors.  Section  2.5  defines  the  PageRank  algorithm  that  orders  vertices  by  applying 
a  certain  eigenvector.  The  algorithm  ensures  the  eigenvector  exists  by  first  perturbing  the 
adjacency  matrix  to  obtain  a  positive  stochastic  matrix  that  defines  a  Markov  chain.  The 
dominant  eigenvector  of  that  matrix  represents  the  Markov  chain’s  stationary  distribution. 
The  eigenvector  can  be  obtained  using  the  power  method,  an  iterated  dot  product  process. 
The  results  described  in  Chapters  3-5  improve  the  PageRank  algorithm’s  performance  by 
decreasing  the  number  and  size  of  the  dot  products  computed  by  the  power  method. 
III.  Establishing  Equitable  Equivalency


3.1.  Overview 

A practical lower bound on the PageRank algorithm's execution time is developed
in Section 3.2. This result is derived by applying two assumptions related to the scaling
value, $\alpha$, and required precision, $\tau$. Combining these assumptions with recent results that
determined the upper bound on the PageRank algorithm's complexity [HaK03] yields the
lower bound. The existing upper bound and the new practical lower bound are essentially
derived by applying a known bound on the number of power method iterations [GoV88],
which is based on the two dominant eigenvalues of the PageRank matrix, S.

The  material  in  Section  3.3  highlights  the  similarity  between  the  dot  product  and 
1-D  Weisfeiler-Lehman  stabilization.  In  particular,  the  coarsest  equitable  partition  yielded 
by  applying  1-D  Weisfeiler-Lehman  stabilization  is  identical  to  the  partition  yielded  by  a 
process  based  on  the  dot  products,  as  illustrated  by  the  example  provided  in  Section  3.3.1. 
This  method  of  finding  the  coarsest  equitable  partition  is  listed  in  Section  3.3.2  and  has 
the  same  complexity  bounds  as  1-D  Weisfeiler-Lehman  stabilization.  Its  key  contribution 
is  to  highlight  the  relationship  between  the  coarsest  equitable  partition  and  dot  product. 

More  precisely,  the  proof  constructed  in  Section  3.4.1  shows  vertices  contained  in 
the  same  block  of  the  coarsest  equitable  partition  must  yield  equal  iterated  dot  products. 
Such  vertices  also  must  have  equal  PageRank  values,  as  established  in  Section  3.4.2  and 
considered  further  in  Section  3.4.3.  This  relationship  between  a  graph’s  coarsest  equitable 
partition  and  its  PageRank  values  was  previously  and  independently  shown  after  applying 
different  techniques  [BLS+06].  The  relationship’s  potential  impact  on  the  execution  time 
needed  to  compute  the  dot  product  and  PageRank  vectors  is  explored  in  Section  3.4.4. 
3.2.  Lower  Bound  on  the  Expected  Number  of  Power  Method  Iterations

Before exploring the relationship between the coarsest equitable partition and the
PageRank vector, it is useful to determine a lower bound on the number of power method
iterations needed to obtain sufficient numerical precision in the PageRank vector, x. The
practical upper bound on the number of iterations, denoted t, is (cf. Section 2.5.2.2)

$$t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau,$$

where $\alpha$ is the scaling value and $\tau$ is the required precision.

The PageRank algorithm's developers report $\log_2 n$ iterations often suffice, and
this behavior has been reported by other researchers [PBM+98, ANT+02]. However, no
theoretical derivations about t's lower bound are known. Applying three key assumptions
does yield a practical lower bound on the number of power method iterations.

The key assumption is $\alpha \ge 0.5$, where the range, $\alpha \in [0.5, 1.0]$, also includes the
default scaling factor, $\alpha = 0.85$. The second assumption is $b = 2$, where the upper bound,
$\log_b \tau / \log_b \alpha$, is independent of any base, $b$. The third assumption is $\tau \le 1/n$, the largest
tolerance that can potentially yield n distinct PageRank values.

Theorem 1  Assuming $\alpha \ge 0.5$ and $b = 2$, the practical lower bound on
the number of power method iterations, t, to ensure $\tau \le 1/n$ is $\log_2 n$.

Proof  Assuming $\alpha \ge 0.5$, $b = 2$, and $\tau \le 1/n$, substitution yields

$$\frac{\log_b \tau}{\log_b \alpha} = \frac{\log_2 (1/n)}{\log_2 (0.5)} = \frac{\log_2 (1/n)}{-1} = -\log_2 (1/n) = \log_2 n \le t.$$  ■


Thus, combining the existing upper bound and the new practical lower bound, the
number of power method iterations, t, needed to compute the PageRank vector, x, satisfies

$$\log_2 n \le t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau. \qquad (29)$$
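
The outer terms of (29) can be evaluated numerically. The following Python sketch is illustrative only: the function name `iteration_bounds` and its default arguments are assumptions introduced here, with `alpha` standing for the scaling value and `tau` for the tolerance. The middle term of (29) is omitted because it requires the second eigenvalue of the PageRank matrix, S.

```python
import math


def iteration_bounds(n, alpha=0.85, tau=None):
    """Return (lower, upper) bounds on the number of power method iterations.

    lower = log2(n) is the practical lower bound of Theorem 1.
    upper = log(tau) / log(alpha) is the bound expressed with the scaling value.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not 0.5 <= alpha < 1.0:
        raise ValueError("the bounds assume 0.5 <= alpha < 1")
    if tau is None:
        tau = 1.0 / n  # largest tolerance considered in Theorem 1
    if not 0.0 < tau <= 1.0 / n:
        raise ValueError("the bounds assume 0 < tau <= 1/n")
    lower = math.log2(n)
    upper = math.log(tau) / math.log(alpha)
    return lower, upper


lower, upper = iteration_bounds(1024, alpha=0.85)
print(f"log2(n) = {lower:.2f}, log_alpha(tau) = {upper:.2f}")
```

With `alpha = 0.5` and `tau = 1/n`, the two values coincide, which is the boundary case used in the proof of Theorem 1.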
3.3.  Motivating  Equitable  Dot  Products  and  PageRank  Values

The fundamental objective of this section is to motivate the techniques applied in
the proofs constructed in Section 3.4. The proof of Theorem 2 shows vertices contained
in the same block of a graph's coarsest equitable partition must yield equal dot products.
The proof of Theorem 3 shows vertices contained in the same block of a graph's coarsest
equitable partition must yield equal PageRank values. Hence, such vertices are equitable
with respect to iterated dot products and PageRank values.

There  are  several  equivalent  methods  of  finding  the  coarsest  equitable  partition, 
where  each  method  yields  the  same  coarsest  equitable  partition  up  to  a  block  permutation. 
For  instance,  three  methods  of  computing  the  coarsest  equitable  partition  are  described  in 
Section 2.3.3. However, neither the method based on the partition's formal definition that
is described in Section 2.3.3.1, nor the most efficient method known of determining the
partition described in Section 2.3.3.2 implicitly suggests a matrix dot product equivalency.
Fortunately, 1-D Weisfeiler-Lehman stabilization, as described in Section 2.3.3.3, yields
such  an  equivalency.  Moreover,  that  method  yields  a  parallel  processing  implementation 
of  computing  the  coarsest  equitable  partition,  since  rows  can  be  sorted  independently. 

The  link  between  the  coarsest  equitable  partition  and  dot  product  is  predicated  on 
observing  1-D  Weisfeiler-Lehman  stabilization  sorts  each  matrix  row  and  the  dot  product 
multiplies  each  matrix  row  by  a  vector.  Appropriately  substituting  prime  numbers  yields  a 
method  that  uses  a  modified  dot  product  to  perform  1-D  Weisfeiler-Lehman  stabilization. 
The  method  is  described  by  example  in  Section  3.3.1  and  formally  listed  in  Section  3.3.2. 
The  algorithm’s  key  contribution  is  to  motivate  the  proofs  given  in  Section  3.4  that  show 
vertices  in  the  same  block  must  have  equal  dot  products  and  PageRank  values. 
3.3.1.  From  Weisfeiler-Lehman  Stabilization  to  Iterated  Dot  Products

This section contains an example illustrating how to obtain the coarsest equitable
partition using dot products. For example, the house graph's coarsest equitable partition is
[{c, d}, {a}, {b, e}] if computed using 1-D Weisfeiler-Lehman stabilization. Iterated dot
products can be modified to yield the same partition.

First, multiplying the house graph's adjacency matrix listed in Table 30(a) with
the ones vector listed in Table 30(b) yields the dot product vector listed in Table 30(c).
Multiplying the house graph's adjacency matrix with that vector yields the vector listed in
Table 30(d), where the entries associated with vertices b and e equal 7. Closer inspection
reveals the corresponding summed intermediate products, 1·2 + 0·3 + 1·2 + 0·2 + 1·3 and
1·2 + 1·3 + 0·2 + 1·2 + 0·3, respectively. Sorting the summed intermediate products yields
the same sorted intermediate products, 0·2 + 0·3 + 1·2 + 1·2 + 1·3.

Similarly, multiplying the house graph's adjacency matrix with the vector listed in
Table 30(d) yields the vector listed in Table 30(e), where vertices b and e yield the same
sorted products, 0·5 + 0·7 + 1·5 + 1·6 + 1·7 = 18, i.e., no further vertex refinement occurs.

Thus, the house graph's coarsest equitable partition appears to be [{c, d}, {a}, {b, e}].

Table 30.  Iterated Dot Products of the House Graph's Adjacency Matrix
(a) $A$  (b) $x_0 = 1^{n \times 1}$  (c) $x_1 = A \cdot x_0$  (d) $x_2 = A \cdot x_1$  (e) $x_3 = A \cdot x_2$

            (a)            (b)   (c)   (d)   (e)
        a  b  c  d  e
    a   0  1  0  0  1       1     2     6    14
    b   1  0  1  0  1       1     3     7    18
    c   0  1  0  1  0       1     2     5    12
    d   0  0  1  0  1       1     2     5    12
    e   1  1  0  1  0       1     3     7    18

The vectors shown in Tables 30(d) and (e) coincide with the house graph's coarsest
equitable partition, [{c, d}, {a}, {b, e}]. Vertices b and e yield 18, where vertices yielding
the same dot product are contained in the same block of the coarsest equitable partition.
However,  it  is  erroneous  to  conclude  iterated  dot  products  of  a  graph’s  adjacency  matrix 
must  coincide  with  the  coarsest  equitable  partition.  In  fact,  a  counter-example  is  obtained 
in  the  next  step  in  the  example,  which  is  based  on  prime  number  substitution. 
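
The iterated dot products of Table 30 can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original method: the helper names `mat_vec` and `blocks_by_value` are assumptions introduced for illustration, and plain lists are used in place of a matrix library.

```python
# House graph adjacency matrix from Table 30(a); rows and columns are a, b, c, d, e.
VERTICES = ["a", "b", "c", "d", "e"]
A = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
]


def mat_vec(matrix, vector):
    """Return the matrix-vector product as a list of row dot products."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def blocks_by_value(vector):
    """Group vertex names by equal vector entries, ordered by entry value."""
    groups = {}
    for name, value in zip(VERTICES, vector):
        groups.setdefault(value, []).append(name)
    return [groups[value] for value in sorted(groups)]


x = [1] * len(A)      # ones seed vector, Table 30(b)
history = []
for _ in range(3):    # Tables 30(c) through 30(e)
    x = mat_vec(A, x)
    history.append(x)
    print(x, blocks_by_value(x))
```

Grouping equal entries of the second and third products gives the blocks {c, d}, {a}, and {b, e}, which is the coincidence the text cautions against generalizing.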

Before  proceeding,  it  is  worth  noting  the  key  goal  is  to  show  vertices  contained  in 
the  same  block  of  the  coarsest  equitable  partition  yield  equal  dot  products.  The  example 
motivates  this  result  by  suitably  modifying  the  iterated  dot  product  process  to  show  that 
the  dot  product  could  be  used,  albeit  inefficiently,  to  find  the  coarsest  equitable  partition. 

The first change substitutes all dot product entries with prime numbers. This step
eliminates a key problem in finding the coarsest equitable partition using the dot product,
namely, that two sets may yield equal products although each set contains distinct values.
For example, the distinct sets, {6, 6} and {4, 9}, yield the equal product, 6·6 = 4·9 = 36.
However, appropriately substituting distinct prime numbers often suffices to distinguish
such sets, e.g., applying the substitution, [4, 6, 9] → [2, 3, 5], yields 3·3 ≠ 2·5.

For instance, substituting the ‘1’s vector listed in Table 31(a) with a seed vector of
‘2’s  yields  the  seed  vector  listed  in  Table  31(b).  Multiplying  the  house  graph’s  adjacency 
matrix  with  the  ‘2’s  seed  vector  yields  the  vector  listed  in  Table  31(c).  Replacing  the  ‘4’s 
with  ‘2’s  and  ‘6’s  with  ‘3’s  yields  the  vector  shown  in  Table  31(d).  Iterating  that  process 
yields  the  vectors  shown  in  Tables  31(e)-(l),  where  the  stabilization  cycle  is  detected  by 
comparing  Tables  31(f)-(h)  with  Tables  31(j)-(l). 
The vector listed in Table 31(f), [3, 5, 2, 2, 5], corresponds with the house graph's
coarsest equitable partition, [{c, d}, {a}, {b, e}]. However, the product, [10, 10, 7, 7, 10],
listed in Table 31(g) does not also correspond with the house graph's coarsest equitable
partition. Thus, applying prime number substitution to each vector fails to yield a process
equivalent to performing 1-D Weisfeiler-Lehman stabilization. More precisely, applying
prime number substitution to each vector resolves equal product issues, e.g., 6·6 = 4·9 → 3·3 ≠ 2·5,
but does not resolve equal summed prime products, e.g., 2·3 + 2·7 = 2·5 + 2·5 = 20.



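Table 31.  Iterated Prime Dot Products of the House Graph's Adjacency Matrix
(a) $x_0 = 1^{n \times 1}$  (b) $x_1 = \mathrm{getPrimes}(x_0)$  (c) $x_2 = A \cdot x_1$  (d) $x_3 = \mathrm{getPrimes}(x_2)$
(e) $x_4 = A \cdot x_3$  (f) $x_5 = \mathrm{getPrimes}(x_4)$  (g) $x_6 = A \cdot x_5$  (h) $x_7 = \mathrm{getPrimes}(x_6)$
(i) $x_8 = A \cdot x_7$  (j) $x_9 = \mathrm{getPrimes}(x_8)$  (k) $x_{10} = A \cdot x_9$  (l) $x_{11} = \mathrm{getPrimes}(x_{10})$

         (a)  (b)  (c)  (d)  (e)  (f)  (g)  (h)  (i)  (j)  (k)  (l)
    a     1    2    4    2    6    3   10    3    6    3   10    3
    b     1    2    6    3    7    5   10    3    8    5   10    3
    c     1    2    4    2    5    2    7    2    5    2    7    2
    d     1    2    4    2    5    2    7    2    5    2    7    2
    e     1    2    6    3    7    5   10    3    8    5   10    3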
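
The stabilization cycle produced by per-vector prime substitution can be checked with the following Python sketch. The function `get_primes` is an assumed reading of getPrimes, inferred from the substitutions quoted in the text (the smallest distinct entry is mapped to 2, the next to 3, and so on); it is not taken from a listing in the source.

```python
# House graph adjacency matrix; rows and columns are a, b, c, d, e.
A = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
]
PRIMES = [2, 3, 5, 7, 11]  # enough primes for a five-vertex graph


def mat_vec(matrix, vector):
    """Return the matrix-vector product as a list of row dot products."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def get_primes(vector):
    """Replace each distinct entry with a prime; smaller entries get smaller primes."""
    rank = {value: i for i, value in enumerate(sorted(set(vector)))}
    return [PRIMES[rank[value]] for value in vector]


x = get_primes([1] * len(A))   # seed vector of 2's
seen = []
while x not in seen:           # stop when a prime vector repeats
    seen.append(x)
    x = get_primes(mat_vec(A, x))

cycle = seen[seen.index(x):]
print(cycle)
```

Under these assumptions the process alternates between two prime vectors, and only one of them matches the coarsest equitable partition, which is why the text rejects substitution alone.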
Entry uniqueness is improved by multiplying rows and columns in the adjacency
matrix with the prime number vector yielded after the most recent iteration. This change
first constructs the temporary matrix, $T = D \cdot A \cdot D$, where $D = \mathrm{diag}(x_i)$, and substitutes
$x_{i+1} = T \cdot x_i$ for $x_{i+1} = A \cdot x_i$. For example, the house graph's initial iteration is shown in
Table 32. Subsequently multiplying the house graph's adjacency matrix with the diagonal
matrix, D, listed in Table 32(d), yields the temporary matrix, T, listed in Table 33(a).

The next iteration yields the vector listed in Table 33(c), where the resulting prime
numbers precisely correspond to the house graph's coarsest equitable partition yielded by
1-D Weisfeiler-Lehman stabilization, [{c, d}, {a}, {b, e}]. Additional iterations yield the
same prime numbers, hence, the dot product iteration process has stabilized.

Table 32.  Constructing the First Prime Diagonal Matrix
(a) $x_0$  (b) $x_1 = A \cdot x_0$  (c) $x_2 = \mathrm{getPrimes}(x_1)$  (d) $D = \mathrm{diag}(x_2)$

         (c)          (d)
                  a  b  c  d  e
    a     2       2  0  0  0  0
    b     3       0  3  0  0  0
    c     2       0  0  2  0  0
    d     2       0  0  0  2  0
    e     3       0  0  0  0  3

Table 33.  First Prime Dot Product Iteration
(a) $T = D \cdot A \cdot D$  (b) $x_2$  (c) $x_3 = T \cdot x_2$  (d) $x_4 = \mathrm{getPrimes}(x_3)$

            (a)            (b)   (c)   (d)
        a  b  c  d  e
    a   0  6  0  0  6       2     84    3
    b   6  0  6  0  9       3    129    5
    c   0  6  0  4  0       2     62    2
    d   0  0  4  0  6       2     62    2
    e   6  9  0  6  0       3    129    5
The last step needed to illustrate vertices contained in the same block yield equal
dot  products  is  motivated  by  the  impact  of  prime  numbers  on  uniqueness.  For  example,  if 
the  vectors,  x,  y,  and  z,  are  composed  of  prime  numbers,  the  intermediate  multiplications 
contained  in  the  dot  products,  x  •  y  and  x  •  z,  are  distinct,  but  the  summed  multiplications, 
i.e.,  the  dot  products,  may  be  equal.  Various  methods  can  be  used  to  resolve  this  issue  by 
appropriately  replacing  the  dot  product  summation,  e.g.,  by  sorting,  and  not  summing,  the 
intermediate  multiplications  contained  in  each  dot  product. 

For example, the vectors listed in Table 34 yield distinct intermediate products,
but happen to have an equal summed value, i.e., $x \cdot z^T = 106 = 5 \cdot 2 + 5 \cdot 3 + 5 \cdot 5 + 3 \cdot 7 + 5 \cdot 7$
and $y \cdot z^T = 106 = 7 \cdot 2 + 7 \cdot 3 + 3 \cdot 5 + 3 \cdot 7 + 5 \cdot 7$. However, comparing sorted product pairs
suffices to distinguish these dot products. For example, the products, $x \cdot z^T$ and $y \cdot z^T$,
yield [5·2, 5·3, 5·5, 3·7, 5·7] and [7·2, 7·3, 3·5, 3·7, 5·7], respectively, corresponding
to [10, 15, 25, 21, 35] and [14, 21, 15, 21, 35]. Finally, sorting the intermediate products in
ascending order yields [10, 15, 21, 25, 35] and [14, 15, 21, 21, 35], respectively.


Since  the  sorted  intermediate  products  are  distinct,  if  x  and  y  are  assumed  to  be  in 
rows  of  an  arbitrary  matrix  and  z  is  the  vector  obtained  after  the  last  dot  product  iteration, 
x and y cannot be located in the same block of the coarsest equitable partition. Therefore,
the  dot  product  can  be  modified  to  obtain  the  same  coarsest  equitable  partition  yielded  by 
1-D  Weisfeiler-Lehman  stabilization. 

Table 34.  Two Equal Dot Products, $x \cdot z^T = y \cdot z^T$

    (a) x    (b) y    (c) z
      5        7        2
      5        7        3
      5        3        5
      3        3        7
      5        5        7
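
The comparison of summed and sorted intermediate products for the vectors of Table 34 can be sketched in Python as follows. The helper name `products` is an assumption introduced for illustration.

```python
x = [5, 5, 5, 3, 5]
y = [7, 7, 3, 3, 5]
z = [2, 3, 5, 7, 7]


def products(u, v):
    """Return the element-wise (intermediate) products of two vectors."""
    return [a * b for a, b in zip(u, v)]


px, py = products(x, z), products(y, z)

# The summed products (ordinary dot products) collide ...
print(sum(px), sum(py))

# ... whereas the sorted intermediate products remain distinguishable.
print(sorted(px), sorted(py))
print(sorted(px) == sorted(py))
```

Summation discards the information that sorting retains, which is the reason the modified process compares sorted rows instead of sums.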
3.3.2.  Finding  the  Coarsest  Equitable  Partition  By  Iterated  Dot  Products

The algorithm listed in Figure 41 determines a graph's coarsest equitable partition
by similarly modifying the dot product. Initial values are assigned on lines 2-5, where the
entries in the vector, x, equal the prime number, ‘2’, and the partition archive vector, s, is
set equal to ‘0’. The partition archive vector, s, stores the values contained in x obtained
at the end of the most recent partition stabilization iteration.

The stabilization iteration occurs on lines 6-26, where the loop terminates if the
partition is identical after two consecutive iterations. The current vertex partition is saved
for future comparison on line 9, i.e., s = x. The diagonal prime number matrix based on
the current partition is constructed on line 11 and applied to the rows and columns of the
adjacency matrix, A, on line 13. The nested loops on lines 14-21 implement the last step,
where the dot product, $x = A \cdot x$, is replaced by the sorting of the summed multiplications
defined by each of the n dot products. The n multiplications defined by the n dot products
are computed on lines 16-18 and lexicographically sorted on line 20.

Each set of identical rows of sorted multiplications is assigned a unique identifier
on line 23. These identifiers are matched to a similarly unique prime number on line 25.
The previous and updated partitions are compared on line 7, where if the block identifier
entries are equal, i.e., if s = x, the vertex partition has stabilized and the main loop can be
terminated. This alternative method is equivalent to 1-D Weisfeiler-Lehman stabilization
and yields the same upper bound on execution time, $O(n^2 \cdot \log n \cdot \log n^2)$, or equivalently,
$O(n^2 \cdot \log^2 n)$. That upper bound can be reduced to $O(d \cdot n \cdot \log^2 n)$ if the graph is stored
using adjacency lists, where $d = \max(\deg(v_i))$, $v_i \in V$, as described in Section 2.3.3.3.
 1.  findCoarsestPartition(A, n)
 2.      # create prime partition vector
 3.      x = 2 · 1^{n×1}
 4.      # create archive partition vector
 5.      s = 0^{n×1}
 6.      # compute the coarsest equitable partition
 7.      while (s ≠ x)
 8.          # archive most recent partition
 9.          s = x
10.          # construct diagonal matrix
11.          D = diag(x)
12.          # construct intermediate prime substitution matrix
13.          T = D · A · D
14.          for i from 1 to n
15.              # compute pair-wise product
16.              for j from 1 to n
17.                  Z_{i,j} = T_{i,j} · x_j
18.              end for
19.              # sort row of pair-wise products
20.              Z_i = sort(Z_i)
21.          end for
22.          # find unique lexicographically sorted rows
23.          x = getIdenticalRowIdentifiers(Z)
24.          # substitute primes for lexicographically unique rows
25.          x = getUniquePrimes(x)
26.      end while
27.      # return equitable partition
28.      return B = [b_1, b_2, ..., b_max(x)]
29.  end findCoarsestPartition

Figure 41.  1-D Weisfeiler-Lehman Stabilization Using Primes and Dot Products
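
The listing in Figure 41 can be approximated by the Python sketch below. Several details are assumptions rather than the author's implementation: the pair-wise product is taken to be T[i][j] * x[j], the two helper routines getIdenticalRowIdentifiers and getUniquePrimes are folded into a dictionary that ranks the distinct sorted rows lexicographically, and blocks are returned as lists of zero-based vertex indices.

```python
def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def find_coarsest_partition(A):
    """Sketch of Figure 41 for a symmetric {0,1} adjacency matrix A."""
    n = len(A)
    primes = first_primes(n)
    x = [2] * n                 # prime partition vector
    s = [0] * n                 # archive partition vector
    while s != x:
        s = x
        # T = D * A * D with D = diag(x), so T[i][j] = x[i] * A[i][j] * x[j]
        T = [[x[i] * A[i][j] * x[j] for j in range(n)] for i in range(n)]
        # sorted pair-wise products replace the summed dot product
        Z = [tuple(sorted(T[i][j] * x[j] for j in range(n))) for i in range(n)]
        # identical rows share an identifier, ranked by lexicographic row order
        ident = {row: k for k, row in enumerate(sorted(set(Z)))}
        # substitute one prime per distinct row
        x = [primes[ident[row]] for row in Z]
    blocks = {}
    for vertex, label in enumerate(x):
        blocks.setdefault(label, []).append(vertex)
    return [blocks[label] for label in sorted(blocks)]


HOUSE = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
]
names = "abcde"
print([[names[v] for v in block] for block in find_coarsest_partition(HOUSE)])
```

Ranking rows lexicographically keeps the prime labels stable once the partition stops changing, so the loop test on line 7 of Figure 41 can be a plain vector comparison.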
3.4.  Relating  Equitable  Dot  Products  and  PageRank  Values
3.4.1.  Equitable  Dot  Products

Given an arbitrary graph, G, and its associated adjacency matrix, A, the coarsest
equitable partition, B, can be computed by applying Weisfeiler-Lehman stabilization. The
method listed in Figure 42 is a matrix-based algorithm for performing Weisfeiler-Lehman
stabilization (cf. Figure 18 in Section 2.3.3.3).

Theorem 2  Vertices contained in the same block of the coarsest equitable
partition have equal iterated dot products, $x_{k+1} = A \cdot x_k$, assuming the initial
vector's entries are equal, i.e., $x_0(i) = x_0(j)$, $\forall i, j$.

Proof  Weisfeiler-Lehman stabilization sorts adjacent labels, whereas the
dot product multiplies each row's values by some vector. Replacing the
adjacent label sorting step on line 6 with $Z = A \cdot x$ facilitates establishing
$v_r, v_s \in b_i \rightarrow Z_r = Z_s$, since a contradiction is obtained if the implication is
false, i.e., if $v_r, v_s \in b_i \wedge Z_r \ne Z_s$.

Assume that the dot products, $Z_r = A_{r,1} \cdot x_1 + A_{r,2} \cdot x_2 + \cdots + A_{r,n} \cdot x_n$
and $Z_s = A_{s,1} \cdot x_1 + A_{s,2} \cdot x_2 + \cdots + A_{s,n} \cdot x_n$, are not equal, i.e., that $Z_r \ne Z_s$.
Then, sorting the adjacent labels, $[x_i : \{v_r, v_i\} \in E]$ and $[x_j : \{v_s, v_j\} \in E]$,
should have shown the vertices are contained in different blocks, i.e., that
$v_r \in b_i$ and $v_s \in b_j$, $i \ne j$. Otherwise, it must be the case that $Z_r = Z_s$. ■


 1.  findCoarsestPartition(A, n)
 2.      x = 1^{n×1}
 3.      s = 0^{n×1}
 4.      while (s ≠ x)
 5.          s = x
 6.          Z_{n×n} = [ x_r | sort(x_c : A_{r,c} = 1) | 0^{1×(n−deg(v_r)−1)} ]
 7.          x = getUniqueRowIdentifiers(Z)
 8.      end while
 9.      return B = [b_i],  x_r = i ↔ v_r ∈ b_i
10.  end findCoarsestPartition

Figure 42.  1-D Weisfeiler-Lehman Stabilization Using Matrices
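
A runnable reading of Figure 42 is sketched below in Python. It is an approximation under stated assumptions: getUniqueRowIdentifiers is taken to number the distinct rows in lexicographic order starting from 1, and the zero padding of each row is omitted because tuples of different lengths already compare unambiguously.

```python
def wl_labels(A):
    """Return stabilized 1-D Weisfeiler-Lehman labels for a {0,1} adjacency matrix."""
    n = len(A)
    x = [1] * n
    s = [0] * n
    while s != x:
        s = x
        # row r: the current label of v_r followed by its sorted neighbour labels
        Z = [
            (x[r],) + tuple(sorted(x[c] for c in range(n) if A[r][c] == 1))
            for r in range(n)
        ]
        # identical rows receive the same identifier
        ident = {row: k + 1 for k, row in enumerate(sorted(set(Z)))}
        x = [ident[row] for row in Z]
    return x


HOUSE = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
]
print(dict(zip("abcde", wl_labels(HOUSE))))
```

Vertices that share a label form one block, so for the house graph the labels group c with d and b with e, leaving a alone.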
More explicitly, during each iteration, each row contained in the label matrix, Z,
corresponds to the sorted labels of every adjacent vertex, i.e.,

$$Z_{n \times n} = [\, x_r \mid \mathrm{sort}(x_c : A_{r,c} = 1) \mid 0^{1 \times (n - \deg(v_r) - 1)} \,]. \qquad (30)$$

Thus, the $r$-th row contains the current label, $x_r$, given to vertex r, followed by the sorted
labels of its adjacent neighbors, $\min(x_c), \ldots, \max(x_c)$, such that $A_{r,c} = 1$. Finally,
the row, $Z_{r,\cdot}$, is padded with $n - \deg(v_r) - 1$ zero entries to ensure Z is a square matrix.
At the end of this iteration, $x_r = x_s$ if and only if $Z_{r,i} = Z_{s,i}$, $\forall i$.

Replacing (30) with the dot product, i.e., $Z = A \cdot x$, induces the computation of n
dot products, where $Z_r = A_{r,1} \cdot x_1 + A_{r,2} \cdot x_2 + \cdots + A_{r,n} \cdot x_n$. However, A is a graph's {0,1}
adjacency matrix, thus, $Z_r = \sum x_c : A_{r,c} = 1$, where $A_{r,c} = 1 \leftrightarrow \{v_r, v_c\} \in E$. Therefore, the
dot product is simply the sum of the vertex labels adjacent to $v_r$, which for some graphs,
suffices to construct the coarsest equitable partition. Thus, vertices contained in the same
block of a graph's coarsest equitable partition yield equal iterated dot products, assuming
the seed vector's entries are equal. Thus, if $x = c^{n \times 1}$, where c is some constant, iterating
$Z = A \cdot x$ and $x = \mathrm{getUniqueRowIdentifiers}(Z)$ yields $Z_r = Z_s$ if $v_r, v_s \in b_i$.

Vertices contained in different blocks may yield equal dot products, i.e., given two
vertices, $v_r$ and $v_s$, such that $v_r \in b_i \wedge v_s \in b_j$, $i \ne j$, it is possible that $Z_r = Z_s$. Therefore, the
converse is false, i.e., $Z_r = Z_s \nrightarrow v_r, v_s \in b_i$. Applying similar logic shows the inverse is
also false, i.e., if $i \ne j$, such that $v_r \in b_i \wedge v_s \in b_j \nrightarrow Z_r \ne Z_s$. Finally, the contrapositive
is true, i.e., $Z_r \ne Z_s \rightarrow v_r \in b_i \wedge v_s \in b_j$, such that $i \ne j$.
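
Both directions of this argument can be observed on the house graph. The Python sketch below assumes the coarsest equitable partition [{c, d}, {a}, {b, e}] as given data and uses a ones seed vector; the names `BLOCK`, `same_block_always_equal`, and `cross_block_ties` are introduced here for illustration only.

```python
NAMES = "abcde"
A = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
]
# Block index of each vertex in [{c, d}, {a}, {b, e}] (assumed as given).
BLOCK = {"a": 1, "b": 2, "c": 0, "d": 0, "e": 2}


def mat_vec(matrix, vector):
    """Return the matrix-vector product as a list of row dot products."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


z = [1] * len(A)               # constant seed vector
same_block_always_equal = True
cross_block_ties = []
for step in range(1, 6):
    z = mat_vec(A, z)
    for r in range(len(A)):
        for s in range(r + 1, len(A)):
            if BLOCK[NAMES[r]] == BLOCK[NAMES[s]]:
                # same block: the iterated dot products should agree
                same_block_always_equal &= (z[r] == z[s])
            elif z[r] == z[s]:
                # different blocks, yet equal dot products: the converse fails
                cross_block_ties.append((step, NAMES[r], NAMES[s]))

print(same_block_always_equal)
print(cross_block_ties)
```

In this run the first product equals the degree vector, so vertex a ties with c and d despite lying in a different block, while vertices within one block never disagree.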
3.4.2.  Equitable  PageRank  Values

The relationship between the coarsest equitable partition and the dot product can
be exploited to improve the PageRank algorithm's performance. The PageRank algorithm
perturbs an adjacency matrix to obtain a strictly positive stochastic matrix. The PageRank
algorithm then uses the power method to find the normalized dominant eigenvector of
that perturbed matrix, where the PageRank vector's entries are unique up to isomorphism.
This relationship is leveraged in Chapters 4 and 5 to improve the PageRank algorithm's
performance on graphs yielding a non-discrete coarsest equitable partition, i.e., partitions
containing one or more blocks composed of multiple vertices.

Theorem 3  Vertices contained in the same block of the coarsest equitable
partition must have equal PageRank values, i.e., $v_r, v_s \in b_i \rightarrow x_r = x_s$.

Proof  Assume the initial PageRank vector is the normalized ones vector,
$x = 1^{n \times 1}/n$, and the PageRank vector is computed using the power method,
which computes a normalized iterated dot product. Applying Theorem 2
suffices to establish vertices contained in the same block of the coarsest
equitable partition must have equal PageRank values.

Identical results also hold with respect to the converse, inverse, and
contrapositive. For instance, vertices contained in different blocks of the
coarsest equitable partition may have the same PageRank value. However,
vertices having different PageRank values must be in different blocks.  ■

Boldi  et  al.  first  showed  that  vertices  contained  in  the  same  block  of  the  coarsest 
equitable  partition  must  also  have  equal  PageRank  values  [BLS+06].  Their  proof  is  based 
on  category  theory,  namely,  the  minimum  base  and  its  fibrations,  which  correspond  to  the 
coarsest  equitable  partition  and  its  blocks,  respectively.  They  show  that  a  Markov  chain’s 
stationary  distribution  is  constant  within  each  fibration.  Similarly,  the  PageRank  value  is 
constant  within  each  block.  Their  work  and  Theorem  3  are  only  sufficient,  i.e.,  no  claims 
are  made  in  either  proof  about  the  PageRank  values  of  vertices  in  different  blocks. 
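
Theorem 3 can be observed numerically on the house graph. The Python sketch below makes assumptions that are not spelled out in this section: the PageRank matrix is taken to be the common form that mixes the degree-normalized adjacency matrix with a uniform term weighted by one minus the scaling value, the scaling value is 0.85, and the iteration count of 100 is arbitrary.

```python
A = [
    [0, 1, 0, 0, 1],   # a
    [1, 0, 1, 0, 1],   # b
    [0, 1, 0, 1, 0],   # c
    [0, 0, 1, 0, 1],   # d
    [1, 1, 0, 1, 0],   # e
]
n = len(A)
alpha = 0.85                          # assumed scaling value
degree = [sum(row) for row in A]      # no vertex has degree zero here


def pagerank_step(x):
    """One power method iteration followed by normalization to unit sum."""
    new = [
        alpha * sum(A[i][j] * x[i] / degree[i] for i in range(n)) + (1.0 - alpha) / n
        for j in range(n)
    ]
    total = sum(new)
    return [value / total for value in new]


x = [1.0 / n] * n                     # normalized ones vector
for _ in range(100):
    x = pagerank_step(x)

for name, value in zip("abcde", x):
    print(name, round(value, 6))
```

The values for c and d agree, as do those for b and e, up to floating-point rounding; any residual difference in the last bits is the finite-precision effect that motivates the algorithms of Chapter 4.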




3.4.3.  Additional  Equitable  Relationships 

Notably, dot product iteration may yield a less refined partition than the coarsest
equitable partition. Conversely, vertices contained in the same block of a graph's coarsest
equitable partition yield equal dot products, assuming the first dot product iteration uses a
constant vector, e.g., the all-ones seed vector. Additionally, vertices contained in the same
block of the orbit partition yield equal dot product values, since vertices contained in the
same orbit are necessarily contained in the same block of the coarsest equitable partition.

An  orbit  partition  may  be  more  refined  than  the  coarsest  equitable  partition,  i.e., 
contain  more  blocks.  Since  the  coarsest  equitable  partition  may  be  less  refined  than  the 
graph’s  orbit  partition,  it  may  reveal  more  vertices  that  have  equal  PageRank  values  and 
is  similarly  more  useful  for  improving  the  PageRank  algorithm’s  performance.  Moreover, 
the  coarsest  equitable  partition  can  be  obtained  in  deterministic  polynomial  time,  whereas 
computing  the  orbit  partition  may  require  exponential  time  (cf  Sections  2.3.3  and  2.3.4). 

For  example,  the  12-vertex  graph  illustrated  in  Figure  43  yields  an  orbit  partition 
containing  three  4-vertex  blocks  and  a  coarsest  equitable  partition  containing  an  8-vertex 
block  and  a  4-vertex  block.  Hence,  more  performance  gains  can  be  obtained  by  applying 
the  coarsest  equitable  partition,  since  it  only  contains  two  blocks.  In  particular,  using  the 
coarsest  equitable  partition  establishes  eight  vertices  have  equal  PageRank  values  and  the 
other four vertices also have equal PageRank values, for any scaling value, $\alpha$.


Figure  43.  Graph  Yielding  Different  Coarsest  Equitable  and  Orbit  Partitions 
3.4.4.  Complexity  Analysis

A matrix is assumed to be a dense, or non-sparse, matrix, unless otherwise stated.
This approach simplifies the analysis and suffices to demonstrate the utility of applying a
graph's coarsest equitable partition to improve the PageRank algorithm's performance.
An arbitrary graph, G, contains $n = |V|$ vertices and defines an adjacency matrix, A, that
contains $n^2$ entries. G's coarsest equitable partition, B, contains $|B|$ blocks, where
$|B| \le n$, $B = [b_1, b_2, \ldots, b_{|B|}]$, and $n = \sum_{i=1}^{|B|} |b_i| = |V|$.


Given the number of vertices, n, finding the coarsest equitable partition requires
$O(n^2 \cdot \log n)$ time using the method described in Section 2.3.3.2. Furthermore, given the
PageRank matrix, S, obtained using some arbitrary scaling value, $\alpha$, the lower bound on
using the power method to obtain the PageRank vector, x, is $\Omega(n^2 \cdot \log n)$, a new result
derived in Section 3.2. The upper bound on using the power method to obtain an arbitrary
precision, $\tau$, in the PageRank vector, x, is $O(n^2 \cdot t)$, where $t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$, as was
described in Section 2.5.2.2. Thus, the PageRank algorithm, which is essentially a power
method variant, has a lower and upper bound of $\Omega(n^2 \cdot \log n)$ and $O(n^2 \cdot t)$, respectively.

The coarsest equitable partition can be determined in $O(n^2 \cdot \log n)$ time using the
algorithm described in Section 2.3.3.2. Thus, a PageRank algorithm variant that applies
the coarsest equitable partition increases its execution time bounds by an $n^2 \cdot \log n$ term.
Thus, such a PageRank algorithm variant yields the lower and upper bounds, $2 \cdot n^2 \cdot \log n$
and $n^2 \cdot \log n + n^2 \cdot t$, respectively, which are $\Omega(n^2 \cdot \log n)$ and $O(n^2 \cdot \log n + n^2 \cdot t)$.




Therefore, any algorithm that applies the coarsest equitable partition to reduce the
execution time of the PageRank algorithm must recoup the cost of initially computing the
partition, i.e., the $n^2 \cdot \log n$ term. Such efficiencies can be obtained by observing vertices
contained in some arbitrary block, $b_i$, yield equal PageRank values, where $|b_i| - 1$ of the
dot products are being (unnecessarily) computed. For instance, the PageRank algorithm's
lower bound, $\Omega(n^2 \cdot \log n)$, can be written as $\Omega(\sum_{i=1}^{|B|} |b_i| \cdot n \cdot \log n)$, i.e., $\sum_{i=1}^{|B|} |b_i| = n$
dot products of length n for $\log n$ iterations. Similarly, the PageRank algorithm's upper
bound, $O(n^2 \cdot t)$, can be equivalently written as $O(\sum_{i=1}^{|B|} |b_i| \cdot n \cdot t)$.

Hence, a PageRank algorithm variant that applies the coarsest equitable partition
yields the lower bound, $\Omega(n^2 \cdot \log n + \sum_{i=1}^{|B|} |b_i| \cdot n \cdot \log n)$, and similarly, the upper bound,
$O(n^2 \cdot \log n + \sum_{i=1}^{|B|} |b_i| \cdot n \cdot t)$. Thus, eliminating $n^2 \cdot \log n$ or more operations will reduce
the time required to obtain the PageRank vector. The ProductRank algorithm described in
Section 4.3 eliminates $|b_i| - 1$ dot products from block $b_i$ of a coarsest equitable partition.
The QuotientRank algorithm described and analyzed in Chapter 5 uses a quotient matrix
to reduce the time needed to obtain the PageRank vector more dramatically.
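
The number of dot products that can be skipped in each power method iteration follows directly from the block sizes. The Python sketch below is a simple count under the assumption that exactly one dot product is retained per block; the function name `dot_product_counts` is hypothetical, and the block sizes are those of the house graph and of the 12-vertex graph of Figure 43.

```python
def dot_product_counts(block_sizes):
    """Return (n, kept, skipped) dot products per power method iteration.

    n       : dot products computed without the partition (one per vertex)
    kept    : dot products computed with one representative per block
    skipped : redundant dot products, the sum of |b_i| - 1 over all blocks
    """
    n = sum(block_sizes)
    kept = len(block_sizes)
    skipped = sum(size - 1 for size in block_sizes)
    return n, kept, skipped


for label, sizes in [("house graph", [2, 1, 2]), ("Figure 43 graph", [8, 4])]:
    n, kept, skipped = dot_product_counts(sizes)
    print(f"{label}: n = {n}, one per block = {kept}, skipped = {skipped}")
```

The count ignores the one-time cost of computing the partition, which must still be recouped over the iterations as discussed above.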

Finally, decreasing $\tau$ increases precision by increasing the executed number of
power  method  iterations.  However,  increasing  precision  cannot  ensure  vertices  contained 
in  the  same  block  have  equal  computed  PageRank  values.  The  three  algorithms  described 
in  Chapters  4  and  5  guarantee  each  block’s  vertices  have  the  same  PageRank  values.  The 
latter  two  algorithms  often  compute  the  PageRank  values  more  efficiently. 
IV. Reducing Equitable Differences and Dot Products


4.1.  Overview 

The  relationship  of  the  PageRank  vector  to  the  coarsest  equitable  partition  shown 
in  Chapter  3  yields  several  methods  of  improving  the  PageRank  algorithm’s  performance 
if  the  graph’s  coarsest  equitable  partition  is  non-discrete,  i.e.,  contains  one  or  more  blocks 
composed  of  multiple  vertices.  For  instance,  the  two  algorithms  described  in  this  chapter 
leverage  that  vertices  contained  in  the  same  block  of  the  coarsest  equitable  partition  have 
identical  iterated  dot  products  up  to  a  permutation  of  the  summed  intermediate  products. 

The  first  algorithm,  AverageRank,  replaces  the  computed  PageRank  value  of  each 
vertex  with  the  average  PageRank  value  of  vertices  contained  in  the  corresponding  block. 
Thus,  vertices  contained  in  the  same  block  will  receive  the  same  PageRank  value,  which 
eliminates  any  numerical  differences  in  the  computed  PageRank  values  of  such  vertices. 
The  second  algorithm,  ProductRank,  computes  one  dot  product  for  each  block  contained 
in  the  coarsest  equitable  partition  during  each  power  method  iteration.  The  ProductRank 
algorithm  guarantees  vertices  contained  in  the  same  block  have  the  same  PageRank  value 
and  reduces  the  execution  time  needed  to  compute  the  PageRank  vector. 

Both  algorithms  ensure  vertices  contained  in  the  same  block  have  equal  PageRank 
values  if  the  power  method  used  to  determine  the  PageRank  vector  is  terminated  after  an 
arbitrary  iteration.  The  ProductRank  algorithm  also  reduces  the  time  needed  to  obtain  the 
PageRank  vector  by  only  computing  certain  dot  products  and  thus  is  more  useful  than  the 
AverageRank  algorithm.  Both  algorithms  are  superseded  by  the  QuotientRank  algorithm 
described  in  Chapter  5,  which  uses  significantly  more  robust  techniques  to  further  reduce 
the  time  needed  to  compute  the  PageRank  vector. 
4.2. Eliminating Equitable PageRank Differences

4.2.1. Numerical Differences and Equitable Vertices

Vertices that are contained in the same block of the coarsest equitable partition of an arbitrary graph must have equal PageRank values. However, the PageRank values may differ if the values are computed using finite-precision arithmetic, where such differences induce an invalid vertex ordering. Such differences may occur even after many iterations of the power method have been performed to obtain the PageRank vector. The following example is based on the tree shown in Figure 44 that yields the adjacency matrix listed in Table 35 and the coarsest equitable partition, [{a,c,g,i}, {b,h}, {d,f}, {e}].


Figure  44.  9-Vertex  Tree:  A  Graph  Yielding  a  4-Block  Equitable  Partition 


Table 35. A 9-Vertex Tree's Adjacency Matrix

       a  b  c  d  e  f  g  h  i
  a    0  1  0  0  0  0  0  0  0
  b    1  0  1  0  1  0  0  0  0
  c    0  1  0  0  0  0  0  0  0
  d    0  0  0  0  1  0  0  0  0
  e    0  1  0  1  0  1  0  1  0
  f    0  0  0  0  1  0  0  0  0
  g    0  0  0  0  0  0  0  1  0
  h    0  0  0  0  1  0  1  0  1
  i    0  0  0  0  0  0  0  1  0
This tree's PageRank vector for α = 0.85 is listed in Table 36(a), and its entries are sorted in descending order in Table 36(b). Since the tree's coarsest equitable partition is [{a,c,g,i}, {b,h}, {d,f}, {e}], vertices {a,c,g,i} should have the same PageRank value. The PageRank vector listed in Table 36(a) initially confirms this expected behavior, since vertices {a,c,g,i} have an equal PageRank value up to the fourth decimal place, 0.0682.

However, close inspection shows the PageRank values of vertices g and i are $1.388 \cdot 10^{-17}$ greater than the PageRank values of vertices a and c. The difference occurs in the last bit of the double-precision format specified by the IEEE 754 standard [ISB85]. The value, $1.388 \cdot 10^{-17}$, is small in magnitude but induces a false order on vertices {a,c,g,i}, where vertices g and i erroneously receive a higher PageRank value than vertices a and c.

In this example, rounding to four decimal places ensures vertices contained in the same block have equal PageRank values. However, simply rounding the PageRank vector cannot guarantee such vertices have equal PageRank values. In particular, the computed PageRank values of vertices contained in each block may be above or below the rounding point and thus may receive different rounded PageRank values.
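The rounding failure can be illustrated with a small sketch. The two values below are hypothetical (they are not taken from Table 36); they differ by only 2e-12 but lie on opposite sides of the four-decimal rounding point, so rounding separates them instead of merging them.

```python
# Two hypothetical PageRank values for vertices in the same block.
# They differ by 2e-12 but straddle the rounding point 0.12345.
x_u = 0.123449999999
x_v = 0.123450000001

rounded_u = round(x_u, 4)
rounded_v = round(x_v, 4)

# The rounded values disagree in the fourth decimal place.
print(rounded_u, rounded_v)
```

Rounding therefore only hides differences that happen to fall away from a rounding point, which is why an explicit correction such as averaging is needed.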

Table 36. A 9-Vertex Tree's PageRank Vector, α = 0.85

  (a) PageRank Vector        (b) Sorted PageRank Vector

  Vertex   PageRank          PageRank   Vertex
  a        0.0682            0.2318     e
  b        0.1818            0.1818     b
  c        0.0682            0.1818     h
  d        0.0659            0.0682     g
  e        0.2318            0.0682     i
  f        0.0659            0.0682     a
  g        0.0682            0.0682     c
  h        0.1818            0.0659     d
  i        0.0682            0.0659     f
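The values in Table 36(a) can be reproduced with a minimal power-method sketch. It assumes the PageRank matrix S = α·A·D⁻¹ + (1 − α)/n and the edge list read from Table 35; the function name `pagerank`, the tolerance, and the iteration cap are illustrative choices, not the thesis's implementation. Whether a last-bit difference appears between vertices of the same block depends on the summation order of a particular implementation, so the sketch only checks agreement to four decimal places.

```python
# Edges of the 9-vertex tree, read from the adjacency matrix in Table 35.
EDGES = [("a", "b"), ("b", "c"), ("b", "e"), ("d", "e"),
         ("e", "f"), ("e", "h"), ("g", "h"), ("h", "i")]
VERTICES = "abcdefghi"


def pagerank(vertices, edges, alpha=0.85, tol=1e-12, max_iter=1000):
    """Power method for S = alpha * A * D^-1 + (1 - alpha) / n."""
    n = len(vertices)
    nbrs = {v: [] for v in vertices}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    x = {v: 1.0 / n for v in vertices}
    for _ in range(max_iter):
        # Row i of S times x; the teleport term uses sum(x) = 1.
        new = {v: alpha * sum(x[u] / len(nbrs[u]) for u in nbrs[v])
                  + (1 - alpha) / n
               for v in vertices}
        total = sum(new.values())
        new = {v: val / total for v, val in new.items()}
        diff = sum((new[v] - x[v]) ** 2 for v in vertices) ** 0.5
        x = new
        if diff <= tol:
            break
    return x


x = pagerank(VERTICES, EDGES)
for v in VERTICES:
    print(v, round(x[v], 4))
```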
4.2.2. AverageRank: An Algorithm for Eliminating Equitable Differences

Fortunately, differences in the computed PageRank values of vertices contained in the same block of the coarsest equitable partition can be easily eliminated. The algorithm listed in Figure 45 sets the PageRank value of each block's vertices to the average of their computed values without internally modifying the PageRank algorithm (cf. Section 2.5). First, the PageRank vector is determined on line 3 using the PageRank algorithm.

The coarsest equitable partition on line 5 is obtained using any method described in Section 2.3.3, e.g., 1-D Weisfeiler-Lehman stabilization. PageRank value differences among vertices in the same block are eliminated on lines 6-11 by setting their PageRank values to their corresponding average PageRank value. The median could be used in lieu of the average, but determining the median PageRank value requires $O(n \log n)$ time, whereas determining the average PageRank value only requires $O(n)$ time.

 1.  findEquitablePageRank(A, n, α, r)
 2.    # compute the PageRank vector
 3.    x ← getPageRank(A, n, α, r)
 4.    # compute the coarsest equitable partition
 5.    B ← findCoarsestPartition(A, n)
 6.    # assign average PageRank value of equitable vertices
 7.    foreach block, b_i ∈ B
 8.      foreach vertex, v_j ∈ b_i
 9.        x(v_j) ← mean(x(b_i))
10.      end foreach
11.    end foreach
12.    return x
13.  end findEquitablePageRank

Figure  45.  AverageRank:  An  Algorithm  for  Ensuring  Equitable  PageRank  Values 
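The averaging step on lines 6-11 of Figure 45 can be sketched as follows. This is a minimal illustration, assuming the PageRank vector is a list indexed by vertex number and the partition is a list of index lists; `average_rank` is a hypothetical name. The block mean is computed once from the unmodified input and then copied, so every vertex in a block receives a bit-identical value.

```python
def average_rank(x, blocks):
    """Replace each entry of x with the mean of its block."""
    y = list(x)
    for block in blocks:
        # Compute the mean from the original values, once per block.
        mean = sum(x[v] for v in block) / len(block)
        for v in block:
            y[v] = mean
    return y


# Hypothetical computed values: vertices 0 and 2 share a block but
# differ slightly because of finite-precision arithmetic.
x = [0.1, 0.3, 0.1000001, 0.5]
blocks = [[0, 2], [1], [3]]
y = average_rank(x, blocks)
print(y)
```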
4.2.3. Complexity Analysis

Applying the power method to an arbitrary PageRank matrix, S, requires at least $\Omega(n^2 \log n)$ time, a new result described in Section 3.2. The coarsest equitable partition can be found in $O(n^2 \log n)$ time by applying the algorithm described in Section 2.3.3.2. Thus, the power method's lower bound equals the upper bound on computing the coarsest equitable partition. Obtaining the average PageRank value of the vertices contained in the same block requires $O(n)$ time and is dominated by $n^2 \log n$. Thus, the lower bound on the AverageRank algorithm is $2 n^2 \log n + n$, which is $\Omega(n^2 \log n)$.

The upper bound on applying the power method to S is $O(n^2 t)$, where t denotes the number of power method iterations and $t < \log_{\alpha} r$ (cf. Section 2.5.2.1). Therefore, the power method's upper bound exceeds the complexity of computing the coarsest equitable partition, since $\log_{\alpha} r > \log n$ for most practical values of n, r, and α. Thus, by combining the upper bounds of applying the power method, $O(n^2 t)$, finding the coarsest equitable partition, $O(n^2 \log n)$, and obtaining mean PageRank values, $O(n)$, the upper bound on the AverageRank algorithm is $n^2 \log n + n^2 t + n$, which is $O(n^2 \log n + n^2 t)$.

The AverageRank algorithm ensures each block's vertices receive equal PageRank values, but costs $n^2 \log n$ more time than the PageRank algorithm. The ProductRank and QuotientRank algorithms described in Section 4.3 and Chapter 5, respectively, provide this same assurance and decrease the PageRank algorithm's execution time if the graph's coarsest equitable partition is non-discrete.
4.3. Eliminating Equitable PageRank Dot Products

4.3.1. Excess Dot Products and Equitable Vertices

The AverageRank algorithm described in Section 4.2.2 eliminates differences in PageRank value between vertices contained in the same block of the coarsest equitable partition but increases the PageRank algorithm's execution time. Conversely, the ProductRank algorithm described in Section 4.3.2 ensures vertices in the same block have equal PageRank values and decreases the PageRank algorithm's execution time.

For example, the house graph shown in Figure 46(a) yields the coarsest equitable partition, [{c,d}, {a}, {b,e}], shown in Figure 46(b). Since vertices b and e are contained in the same block, it suffices to obtain one of their associated dot products and assign its value to their associated PageRank vector entries during each power method iteration.

Multiplying the house graph's PageRank matrix listed in Table 38 with the initial PageRank vector, the normalized all-ones vector shown in Table 39(a), yields the updated PageRank vector shown in Table 39(b). More significantly, the intermediate product sums yielded by vertices b and e are 0.455·0.2 + 0.030·0.2 + 0.455·0.2 + 0.030·0.2 + 0.313·0.2 and 0.455·0.2 + 0.313·0.2 + 0.030·0.2 + 0.455·0.2 + 0.030·0.2, respectively.

Inspection reveals ordering the intermediate products yields the same sorted list, 0.030·0.2 + 0.030·0.2 + 0.313·0.2 + 0.455·0.2 + 0.455·0.2. Vertices c and d also yield the same intermediate products up to isomorphism, since vertices c and d are also contained in the same block of the coarsest equitable partition. Hence, a second dot product can be eliminated during each power method iteration. The ProductRank algorithm described in Section 4.3.2 applies this technique to decrease the PageRank algorithm's execution time. The potential performance improvement is formally determined in Section 4.3.3.
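The permutation property can be checked directly for the first iteration. The sketch below uses rows b and e of the PageRank matrix as printed in Table 38 and the initial vector from Table 39(a); the variable names are illustrative.

```python
# Rows b and e of the house graph's PageRank matrix (Table 38).
S_row_b = [0.455, 0.030, 0.455, 0.030, 0.313]
S_row_e = [0.455, 0.313, 0.030, 0.455, 0.030]

# Initial PageRank vector: the normalized all-ones vector.
x0 = [0.2] * 5

# Intermediate products of each dot product, sorted to ignore order.
terms_b = sorted(s * x for s, x in zip(S_row_b, x0))
terms_e = sorted(s * x for s, x in zip(S_row_e, x0))

print(terms_b == terms_e)
```

The two sorted lists are identical, so the two dot products differ only in the order in which the same terms are summed.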
(a) House Graph (b) Coarsest Equitable Partition

Figure 46. House Graph and Its 3-Block Coarsest Equitable Partition

Table 37. House Graph's Adjacency and Degree Matrix

  (a) A                        (b) D

       a  b  c  d  e                a  b  c  d  e
  a    0  1  0  0  1           a    2  0  0  0  0
  b    1  0  1  0  1           b    0  3  0  0  0
  c    0  1  0  1  0           c    0  0  2  0  0
  d    0  0  1  0  1           d    0  0  0  2  0
  e    1  1  0  1  0           e    0  0  0  0  3

Table 38. House Graph's Stochastic PageRank Matrix, S, α = 0.85

       a      b      c      d      e
  a    0.030  0.313  0.030  0.030  0.313
  b    0.455  0.030  0.455  0.030  0.313
  c    0.030  0.313  0.030  0.455  0.030
  d    0.030  0.030  0.455  0.030  0.313
  e    0.455  0.313  0.030  0.455  0.030

Table 39. Initial PageRank Power Method Iterations of the House Graph

  (a) $x_0 = \mathbf{1}^{n \times 1}/n$
  (b) $x_1 = S x_0 / \sum (S x_0)$
  (c) $x_2 = S x_1 / \sum (S x_1)$
  (d) $x_3 = S x_2 / \sum (S x_2)$

       (a) x_0   (b) x_1   (c) x_2   (d) x_3
  a    0.2       0.143     0.175     0.164
  b    0.2       0.257     0.237     0.246
  c    0.2       0.172     0.176     0.172
  d    0.2       0.172     0.176     0.172
  e    0.2       0.257     0.237     0.246
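The iterates in Table 39 can be reproduced with a short sketch that builds S from the adjacency matrix in Table 37(a) and applies three normalized power method steps. The helper names `pagerank_matrix` and `power_step` are illustrative, and the degree vector is taken as the column sums of A.

```python
# House graph adjacency matrix from Table 37(a), vertex order a..e.
A = [[0, 1, 0, 0, 1],
     [1, 0, 1, 0, 1],
     [0, 1, 0, 1, 0],
     [0, 0, 1, 0, 1],
     [1, 1, 0, 1, 0]]
ALPHA = 0.85


def pagerank_matrix(adj, alpha):
    """Build S = alpha * A * D^-1 + (1 - alpha) / n."""
    n = len(adj)
    deg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[alpha * adj[i][j] / deg[j] + (1 - alpha) / n
             for j in range(n)] for i in range(n)]


def power_step(S, x):
    """One power method iteration followed by sum normalization."""
    y = [sum(s * v for s, v in zip(row, x)) for row in S]
    total = sum(y)
    return [v / total for v in y]


S = pagerank_matrix(A, ALPHA)
x = [1 / 5] * 5
iterates = []
for _ in range(3):
    x = power_step(S, x)
    iterates.append([round(v, 3) for v in x])

for row in iterates:
    print(row)
```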
4.3.2. ProductRank: An Algorithm for Eliminating Equitable Dot Products

In  the  AverageRank  algorithm  described  in  Section  4.2.2,  the  PageRank  value  of 

each  block’s  vertices  was  set  to  their  average  PageRank  value.  That  method  exploits  the 
relationship  established  in  Section  3.4.2,  that  such  vertices  yield  identical  dot  products  up 
to  a  permutation  and  thus  have  equal  PageRank  values.  In  that  algorithm,  the  average  is 
obtained  after  the  power  method  is  terminated,  i.e.,  no  internal  modifications  are  made  to 
the  PageRank  algorithm.  The  iterated  dot  product  relationship  is  further  exploited  in  this 
section  to  develop  an  algorithm  that  reduces  the  time  needed  to  compute  the  PageRank 
vector  by  computing  one  dot  product  for  each  block  during  each  power  method  iteration. 
However,  to  achieve  that  result,  the  PageRank  algorithm  must  be  internally  modified. 

The unaltered PageRank algorithm listed in Figure 33 is repeated in Figure 47(a). Since lines 1-10 are unchanged in the ProductRank algorithm listed in Figure 47(b), they are not listed. The first change is to obtain the coarsest equitable partition on lines 11 and 12 using the findCoarsestPartition function. That function can implement any of the three methods described in Section 2.3.3 for computing the graph's coarsest equitable partition.

The next change is to replace the matrix multiplication step in line 16, x ← S·x, of the original PageRank algorithm with lines 17-24 in the ProductRank algorithm. The lines are similar to lines 6-11 in Figure 45, which assign the average PageRank value of each block's vertices to those same vertices. In this algorithm, the dot product of the first vertex in every block is computed on lines 19 and 20. The resulting updated PageRank value is subsequently assigned to each vertex in that block using the loop on lines 21-23. The ProductRank algorithm's remaining lines, 25-31, are identical to lines 17-23 in the original PageRank algorithm listed in Figure 47(a).
(a) PageRank Algorithm

 1.  getPageRank(A, n, α, r)
 2.    # perturb adjacency matrix
 3.    d_j ← Σ_i A_ij
 4.    D ← diag(d)
 5.    # apply PageRank perturbation
 6.    S ← α·A·D^{-1} + (1 − α)/n
 7.    # initialize vectors and counter
 8.    x ← 1^{n×1}/n
 9.    s ← 0^{n×1}
10.    z ← 0
11.    # iterate power method
12.    while (‖s − x‖_2 > r)
13.      # save PageRank vector
14.      s ← x
15.      # update PageRank vector
16.      x ← S·x
17.      # normalize PageRank vector
18.      x ← x/Σx
19.      # increment loop counter
20.      z ← z + 1
21.    end while
22.    return x
23.  end PageRank

(b) ProductRank Algorithm

 1.  getPageRank(A, n, α, r)
       (lines 2-10 as in (a))
11.    # get coarsest equitable partition
12.    B ← findCoarsestPartition(A, n)
13.    # iterate power method
14.    while (‖s − x‖_2 > r)
15.      # save PageRank vector
16.      s ← x
17.      # update PageRank vector
18.      foreach block, b_i ∈ B
19.        v_j ← first vertex in b_i
20.        x̂ ← S_{j,*}·x
21.        foreach vertex, v_k ∈ b_i
22.          x_k ← x̂
23.        end foreach
24.      end foreach
25.      # normalize PageRank vector
26.      x ← x/Σx
27.      # increment loop counter
28.      z ← z + 1
29.    end while
30.    return x
31.  end PageRank

Figure  47.  ProductRank:  An  Algorithm  for  Eliminating  Equitable  Dot  Products 
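The block-wise update on lines 17-24 of Figure 47(b) can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis's implementation: the partition is passed in as a list of index lists, each dot product is taken against the saved vector s from the previous iteration (the figure does not make this explicit), and the names `product_rank` and `pagerank_matrix` are hypothetical.

```python
def pagerank_matrix(adj, alpha):
    """Build S = alpha * A * D^-1 + (1 - alpha) / n."""
    n = len(adj)
    deg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[alpha * adj[i][j] / deg[j] + (1 - alpha) / n
             for j in range(n)] for i in range(n)]


def product_rank(S, blocks, tol=1e-10, max_iter=1000):
    """Power method computing one dot product per block."""
    n = len(S)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        s = list(x)  # save the previous PageRank vector
        for block in blocks:
            j = block[0]  # representative (first) vertex of the block
            value = sum(S[j][k] * s[k] for k in range(n))
            for v in block:
                x[v] = value  # copy the value to every block member
        total = sum(x)
        x = [v / total for v in x]
        if sum((a - b) ** 2 for a, b in zip(s, x)) ** 0.5 <= tol:
            break
    return x


# House graph (Table 37) and its partition [{c,d}, {a}, {b,e}].
A = [[0, 1, 0, 0, 1],
     [1, 0, 1, 0, 1],
     [0, 1, 0, 1, 0],
     [0, 0, 1, 0, 1],
     [1, 1, 0, 1, 0]]
BLOCKS = [[2, 3], [0], [1, 4]]

S = pagerank_matrix(A, 0.85)
x = product_rank(S, BLOCKS)
print([round(v, 3) for v in x])
```

Because each block value is computed once and copied, vertices in the same block hold bit-identical entries after every iteration, regardless of when the loop terminates.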
4.3.3. Complexity Analysis

The AverageRank algorithm described in Section 4.2.2 ensures vertices contained in the same block have equal PageRank values. The ProductRank algorithm described in Section 4.3.2 similarly ensures vertices contained in the same block have equal PageRank values. However, the ProductRank algorithm reduces the number of operations needed to compute the PageRank vector if a graph's coarsest equitable partition is non-discrete. The performance gain is obtained by only computing one dot product for every block during each power method iteration. This method succeeds since vertices contained in the same block yield equal dot products up to a permutation of their intermediate multiplications. Thus, the ProductRank algorithm listed in Section 4.3.2 supersedes the AverageRank algorithm listed in Section 4.2.2.

For example, it was shown in Section 4.3.1 that applying the PageRank algorithm to the house graph requires five dot products to be computed during each power method iteration, which requires a total of 25 multiplications and 20 additions. The house graph's coarsest equitable partition contains three blocks, where two blocks contain two vertices; thus, one dot product can be eliminated from each of those two blocks, i.e., two dot products in total, in each power method iteration.

Each iteration performed by the AverageRank algorithm computes 5 dot products, which requires 25 multiplications and 20 additions. Conversely, each iteration performed in the ProductRank algorithm only computes 3 dot products, requiring 15 multiplications and 12 additions. Thus, if three iterations are executed, ensuring three bits of precision, the AverageRank algorithm performs (25 + 20)·3 = 135 operations, but the ProductRank algorithm only performs (15 + 12)·3 = 81 operations.
If α > 0.5 and r < 1/n, the PageRank algorithm needs at least $\Omega(n^2 \log n)$ time, as shown in Section 3.2. Obtaining the coarsest equitable partition requires $O(n^2 \log n)$ time, as noted in Section 2.3.3.2. Hence, the ProductRank algorithm obtains the coarsest equitable partition in $\Omega(n^2 \log n)$ time and applies the power method in $\Omega(n^2 \log n)$ time, yielding an overall lower bound of $\Omega(n^2 \log n)$.

The PageRank algorithm's upper bound is $O(n^2 t)$, where $t < \log_{\alpha} r$ denotes the maximum number of required power method iterations, as described in Section 2.5.2. The ProductRank algorithm computes the graph's coarsest equitable partition in $O(n^2 \log n)$ time and applies the power method in $O(n^2 t)$ time, yielding an overall upper bound of $O(n^2 \log n + n^2 t)$.

Sharper bounds can be derived by accounting for the dot products eliminated by applying the coarsest equitable partition. If a block contains s vertices, $s - 1$ dot products, or $(s-1) \cdot n$ multiplications and $(s-1) \cdot (n-1)$ additions, are saved in each power method iteration. If t iterations are performed, $(s-1) \cdot (2n-1) \cdot t$ operations are eliminated.

A coarsest equitable partition, B, contains $b = |B|$ blocks, where $n = \sum_{i=1}^{b} |b_i|$. Only one dot product is computed for each block; therefore, the algorithm yields a lower bound of $\Omega(n^2 \log n + b \cdot n \cdot \log n)$, where $n^2 \log n$ operations are used to compute the coarsest equitable partition and $b \cdot n \cdot \log n$ operations are used to obtain b dot products of length n for at least $\log n$ iterations. Similarly, the upper bound is $O(n^2 \log n + b \cdot n \cdot t)$.
For example, the 4x4 grid graph shown in Figure 48 yields the coarsest equitable partition listed in Table 40. The partition contains 3 blocks; hence, only 3 dot products are computed during every power method iteration. Eliminating 13 of 16 dot products saves 13·(16 + 15) = 403 floating-point operations per iteration. Computing the remaining three dot products only requires 3·(16 + 15) = 93 floating-point operations.

Obtaining the coarsest equitable partition requires as many as $16^2 \cdot \log_2 16 = 1024$ operations. Since ⌈1024/403⌉ = 3, at least three power method iterations must be executed before the PageRank algorithm's execution time is reduced. However, since $\log_2 n = 4$, at least four iterations are performed to obtain the PageRank vector. Thus, if α > 0.5 and r < 1/n = 1/16 = 0.0625, the ProductRank algorithm shown in Figure 47(b) needs less time than the PageRank algorithm to compute this grid graph's PageRank vector.
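The break-even arithmetic for the 4x4 grid can be restated as a short calculation. The sketch assumes, as in the text, that a dot product of length n costs n multiplications and n − 1 additions and that computing the partition costs n²·log₂ n operations; the variable names are illustrative.

```python
import math


def dot_product_ops(n):
    """Operations in one length-n dot product."""
    return n + (n - 1)  # n multiplications, n - 1 additions


n, b = 16, 3  # vertices and blocks of the 4x4 grid graph

saved_per_iter = (n - b) * dot_product_ops(n)
remaining_per_iter = b * dot_product_ops(n)
partition_cost = n * n * int(math.log2(n))

# Iterations needed before the savings cover the partition cost.
break_even_iters = math.ceil(partition_cost / saved_per_iter)

print(saved_per_iter, remaining_per_iter, partition_cost, break_even_iters)
```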
Figure 48. 4x4 Grid: A Graph Yielding a 3-Block Equitable Partition

Table 40. 4x4 Grid Graph's 3-Block Coarsest Equitable Partition

  i    Block b_i ∈ B
  1    {1,4,13,16}
  2    {2,3,5,8,9,12,14,15}
  3    {6,7,10,11}
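The partition in Table 40 can be reproduced with a naive color refinement sketch in the spirit of the 1-D Weisfeiler-Lehman stabilization mentioned in Section 4.2.2. It is not the $O(n^2 \log n)$ algorithm of Section 2.3.3.2, and it assumes the grid's vertices are numbered 1-16 in row-major order; the function names are illustrative.

```python
def grid_graph(rows, cols):
    """Adjacency lists of a rows x cols grid, numbered row-major from 1."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            v = r * cols + c + 1
            adj[v] = [rr * cols + cc + 1
                      for rr, cc in ((r - 1, c), (r + 1, c),
                                     (r, c - 1), (r, c + 1))
                      if 0 <= rr < rows and 0 <= cc < cols]
    return adj


def coarsest_equitable_partition(adj):
    """Refine vertex colors until the number of color classes is stable."""
    color = {v: 0 for v in adj}
    while True:
        # Signature: own color plus the sorted colors of the neighbors.
        sig = {v: (color[v], tuple(sorted(color[u] for u in adj[v])))
               for v in adj}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new_color = {v: ids[sig[v]] for v in adj}
        if len(set(new_color.values())) == len(set(color.values())):
            break
        color = new_color
    blocks = {}
    for v in sorted(adj):
        blocks.setdefault(color[v], []).append(v)
    return sorted(blocks.values())


blocks = coarsest_equitable_partition(grid_graph(4, 4))
for block in blocks:
    print(block)
```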



4.3.4.  Algorithm  Applicability 

The decision to use the ProductRank algorithm listed in Figure 47(b) or the more efficient QuotientRank algorithm developed in Chapter 5 to obtain the PageRank vector hinges on several factors, where this section prefaces the more in-depth analysis provided in Section 5.7. The key factors are the number of blocks, b, in a graph's coarsest equitable partition, B, the PageRank scaling value, α, and the required precision in the PageRank vector, r. To simplify the analysis, it is assumed α > 0.5 and r < 1/n, where n = |V|, the number of vertices in the input graph, G.

Loosely stated, if b is sufficiently less than n, applying either method dramatically reduces the time needed to obtain the PageRank vector. However, if b is sufficiently large with respect to n, where b < n, applying the ProductRank or QuotientRank algorithm can increase the time needed to obtain a PageRank vector by a factor of two, i.e., to $2 n^2 \log n$. The worst case occurs if the coarsest equitable partition is discrete, i.e., b = n, since the time needed to obtain the coarsest equitable partition equals the lower bound of obtaining the PageRank vector, $n^2 \log n$. However, the PageRank algorithm's upper bound can be significantly reduced if the graph's coarsest equitable partition is non-discrete, i.e., if b < n.

Unfortunately, it is impossible to assess a priori if the graph yields a non-discrete coarsest equitable partition containing a sufficiently small number of blocks, b. Assuming b is sufficiently small with respect to n, the PageRank vector is obtained more efficiently using either the ProductRank algorithm listed in Section 4.3.2 or the QuotientRank algorithm constructed in Chapter 5. Although both algorithms are more efficient than the PageRank algorithm, the ProductRank algorithm is easier to implement, whereas the QuotientRank algorithm more dramatically reduces the time needed to compute the PageRank vector.
V. Lifting PageRank Values


5.1.  Overview 

A  more  dramatic  decrease  in  the  PageRank  algorithm’s  execution  time  is  obtained 
using  the  quotient  matrix  induced  by  the  graph’s  coarsest  equitable  partition,  as  shown  in 
Section  5.2.  An  example  that  applies  the  quotient  matrix  to  obtain  the  PageRank  vector  is 
constructed  in  Section  5.3.  The  corresponding  algorithm  is  described  in  Section  5.4  and 
its  analysis  is  contained  in  Section  5.5.  The  utility  of  the  quotient  graph  was  also  obtained 
using  different  tools  by  Boldi  et  al.  [BLS+06].  However,  they  did  not  develop  a  method 
similar  to  the  QuotientRank  algorithm  described  and  analyzed  in  Sections  5.4  and  5.5. 

For  example,  the  coarsest  equitable  partition  of  the  graph  shown  in  Figure  49(a)  is 

[{2,4,6,8,10,12},  {1,3,5,7,9,11}]  and  induces  the  quotient  graph  shown  in  Figure  49(b). 

Every  vertex  in  each  block  is  linked  to  two  vertices  in  the  other  block,  hence,  the  pair  of 
2-edges,  and  odd-labeled  vertices  are  linked  to  one  odd-labeled  vertex,  hence,  the  1-loop. 
As  will  be  shown  in  this  chapter,  the  PageRank  vector  can  be  obtained  more  efficiently 
by  lifting  the  dominant  eigenvector  of  a  2x2  quotient  matrix  (cf  Section  2.3.5),  instead 
of  computing  the  dominant  eigenvector  of  the  original  graph’s  12x12  PageRank  matrix. 


(a)  Pseudo-Benzene  (b)  Quotient  Graph 

Figure  49.  Pseudo-Benzene:  A  Graph  Yielding  a  2-Block  Equitable  Partition  [StT99] 
5.2. Quotient Computations

As described in Section 2.3.5, given some arbitrary matrix, M, and some similarly arbitrary partition, B, the entries in the quotient matrix, Q, induced by B correspond to the average row sums in M with respect to B. Formally, Q is obtained by computing [Hae95]

$$Q = (B^T B)^{-1} B^T M B = B^{+} M B,$$

where B is the characteristic block matrix whose n = |V| rows correspond to the vertices contained in V and b = |B| columns correspond to the blocks contained in B, respectively.

Given an arbitrary graph, G, and its associated adjacency matrix, A, it is critical to construct the appropriate quotient matrix, Q. For instance, an incorrect approach is to first compute the quotient matrix based on A, where

$$Q_A = B^{+} A B,$$

and subsequently apply the PageRank perturbation, where

$$S_Q = \alpha \cdot Q_A \cdot D_Q^{-1} + (1 - \alpha)/n. \quad (33)$$

However, applying the PageRank algorithm to $S_Q$ yields the correct PageRank vector if α = 1. Given an adjacency matrix, A, the correct approach for any value of α, as shown in this section, is to apply the traditional PageRank perturbation to A, where

$$S_A = \alpha \cdot A \cdot D_A^{-1} + (1 - \alpha)/n. \quad (34)$$

The correct quotient matrix, Q, subsequently can be obtained by computing

$$Q = (B^T B)^{-1} B^T S_A B = B^{+} S_A B. \quad (35)$$
To show the PageRank vector, x, associated with a PageRank matrix, $S = S_A$, can be lifted from a quotient matrix, Q, using (35), three key results must be obtained. First, it must be established Q and S yield the same dominant eigenvalue, one. Second, it must be shown a unique eigenvector, r, is associated with Q's dominant eigenvalue, one. Finally, it must be shown that S's PageRank vector, x, can be lifted from Q's eigenvector, r.

The quotient matrix, Q, yielded by applying (35) is often not stochastic. However, Q's row sums equal one or more of S's corresponding row sums. Moreover, Q's diagonal entries, as well as many of Q's non-diagonal entries, may equal zero, i.e., Q is often not primitive or irreducible. Therefore, Q does not satisfy the conditions of the Perron or the Perron-Frobenius theorems; hence, they cannot be directly applied to ensure that Q yields a unique eigenvector associated with the dominant eigenvalue, one. Instead, more robust machinery, namely, the interlacing and lifting properties described in Section 2.3.5, must be used to establish the eigendecomposition relationships between a PageRank matrix, S, and the quotient matrix, Q, induced by S's coarsest equitable partition, B.

Theorem 4 The dominant eigenvalue of an equitable quotient matrix, Q, of a positive stochastic matrix, S, equals S's dominant eigenvalue, one.

Proof Given the quotient matrix, Q, induced by an arbitrary partition, B, of an arbitrary matrix, M, the eigenvalues of M and Q are interlaced, such that $\lambda_{n-b+i}(M) \le \lambda_i(Q) \le \lambda_i(M)$, $|\lambda_i| \ge |\lambda_{i+1}|$. If B is equitable, e.g., if B is the coarsest equitable partition of M = S, Q's eigenvalues are some subset of S's eigenvalues. If S is the weighted adjacency matrix of some strongly connected graph, G, Q and S must have the same dominant eigenvalues, i.e., $\lambda_1(Q) = \lambda_1(S)$ [God93, CRS97].

A stochastic positive matrix, e.g., a PageRank matrix, S, yields the dominant eigenvalue, one. Q may not be stochastic. However, applying the Perron and Perron-Frobenius theorems to the PageRank matrix, S, and the interlacing property to an equitable quotient matrix, Q, shows that Q also yields the dominant eigenvalue, one, i.e., $\lambda_1(Q) = \lambda_1(S) = 1$. ■
The next step establishes Q has a unique eigenvector, r, whose entries are a subset of the entries in S's dominant eigenvector, x, where x can be obtained more efficiently by simply lifting it from Q's dominant eigenvector, r.

Theorem 5 The entries contained in the dominant eigenvector, r, yielded by the quotient matrix, Q, defined by the coarsest equitable partition, B, of a positive stochastic matrix, S, are a subset of the entries contained in S's dominant eigenvector, x. Each entry in x can be obtained by simply lifting it from Q's dominant eigenvector, r.

Proof A coarsest equitable partition, B, is equitable; therefore, the lifting identity, $S \cdot x = S \cdot B \cdot r = B \cdot Q \cdot r = B \cdot r = x$, ensures that each entry in Q's dominant eigenvector, r, also equals some entry in S's PageRank vector, x.

Applying the Perron-Frobenius theorem ensures elements in x are positive and unique; thus, r's elements are positive and unique. The lifting identity, $x = B \cdot r$, maps entries in Q's eigenvector, r, to S's eigenvector, x. Each of B's n non-zero entries equals one; thus, lifting is simply an exercise in memory copying, where $x = B \cdot r$ and $r = B^{+} \cdot x = (B^T \cdot B)^{-1} \cdot B^T \cdot x$. ■

Thus, given some arbitrary graph, G, and its adjacency matrix, A, the PageRank
vector can be obtained by normalizing the dominant eigenvector of the PageRank matrix,
$S = \alpha \cdot A \cdot D^{-1} + (1 - \alpha)/n$. G’s PageRank vector, x, can be constructed more efficiently by
lifting it from Q’s PageRank vector, r. The quotient matrix, Q, is yielded by applying the
block matrix, B, defined by the partition, $\mathcal{B}$, to the PageRank matrix, S, where

$Q = B^{+} \cdot S \cdot B = (B^{T} \cdot B)^{-1} \cdot B^{T} \cdot S \cdot B$. (36)

Assuming Q’s dominant eigenvector, r, is available, e.g., by applying the power method,
S’s dominant eigenvector, x, is obtained by simply lifting, or copying, it from r, where

$x = B \cdot r$. (37)


Finally, that vector is normalized by its sum, yielding the PageRank vector, x, where

$x = x / \sum_i x_i$. (38)

5.3. Lifting the House Graph’s PageRank Vector

The example in this section illustrates how the PageRank vector can be efficiently
lifted from the dominant eigenvector of the quotient matrix. The house graph depicted in
Figure 50(a) yields the coarsest equitable partition, $\mathcal{B} = \{\{c, d\}, \{a\}, \{b, e\}\}$, illustrated in
Figure 50(b). Its associated block matrix, B, is listed in Table 41, where ‘1’s reflect $\mathcal{B}$’s
vertex block membership. The intermediate product, $N = B^{T} \cdot B$, is listed in Table 42(a),
where a diagonal entry equals its corresponding column sum from B. Reciprocating all of
the diagonal entries yields its matrix inverse, $N^{-1}$, as shown in Table 42(b).

(a) House Graph   (b) Coarsest Equitable Partition
Figure 50. House Graph’s 3-Block Coarsest Equitable Partition


Table 41. Characteristic Block Matrix, B

        {c, d}    {a}    {b, e}
a          0       1        0
b          0       0        1
c          1       0        0
d          1       0        0
e          0       0        1

Table 42. Characteristic Block Matrix Products

(a) $N = B^{T} \cdot B$, $N_{ii} = \sum_j B_{ji}$

           {c, d}    {a}    {b, e}
{c, d}        2       0        0
{a}           0       1        0
{b, e}        0       0        2

(b) $N^{-1} = (B^{T} \cdot B)^{-1}$, $N^{-1}_{ii} = 1/N_{ii}$

           {c, d}    {a}    {b, e}
{c, d}       1/2      0        0
{a}           0       1        0
{b, e}        0       0       1/2

The house graph’s adjacency matrix is listed in Table 43. Given the scaling factor,
$\alpha = 0.85$, the house graph yields the PageRank matrix shown in Table 44. Subsequently
applying the quotient matrix identity,

$Q = B^{+} \cdot S \cdot B = (B^{T} \cdot B)^{-1} \cdot B^{T} \cdot S \cdot B$, (39)

yields the quotient matrix, Q, listed in Table 45. As this example illustrates, a stochastic
PageRank matrix, S, may yield a non-stochastic quotient matrix, Q, i.e., none of the rows
or columns in Q sum to one, although each row sum of Q equals the row sum that S yields
for the vertices contained in the corresponding block.
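The construction of S and Q for the house graph can be reproduced with a short sketch. The function names `pagerank_matrix` and `quotient_matrix` are hypothetical, the partition is assumed to be given as one block index per vertex, and the graph is assumed to have no isolated vertices (every column sum of A is non-zero). The adjacency matrix is the one listed in Table 43.

```python
def pagerank_matrix(A, alpha):
    """S = alpha * A * D^-1 + (1 - alpha)/n, with D = diag(column sums of A)."""
    n = len(A)
    d = [sum(A[i][j] for i in range(n)) for j in range(n)]  # column sums
    return [[alpha * A[i][j] / d[j] + (1.0 - alpha) / n for j in range(n)]
            for i in range(n)]


def quotient_matrix(S, block_of, b):
    """Q = (B^T B)^-1 * B^T * S * B: block sums of S, averaged over each row block."""
    n = len(S)
    sizes = [block_of.count(j) for j in range(b)]
    Q = [[0.0] * b for _ in range(b)]
    for u in range(n):
        for w in range(n):
            Q[block_of[u]][block_of[w]] += S[u][w]
    return [[Q[i][j] / sizes[i] for j in range(b)] for i in range(b)]


A = [[0, 1, 0, 0, 1],   # a
     [1, 0, 1, 0, 1],   # b
     [0, 1, 0, 1, 0],   # c
     [0, 0, 1, 0, 1],   # d
     [1, 1, 0, 1, 0]]   # e
block_of = [1, 2, 0, 0, 2]   # a..e -> {c, d} = 0, {a} = 1, {b, e} = 2
S = pagerank_matrix(A, 0.85)
Q = quotient_matrix(S, block_of, 3)
```

Rounded to four digits, `S` and `Q` agree with Tables 44 and 45.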

Table 43. House Graph’s Adjacency Matrix, A

      a    b    c    d    e
a     0    1    0    0    1
b     1    0    1    0    1
c     0    1    0    1    0
d     0    0    1    0    1
e     1    1    0    1    0

Table 44. House Graph’s PageRank Matrix, S, α = 0.85

        a         b         c         d         e         Σ
a    0.0300    0.3133    0.0300    0.0300    0.3133    0.7167
b    0.4550    0.0300    0.4550    0.0300    0.3133    1.2833
c    0.0300    0.3133    0.0300    0.4550    0.0300    0.8583
d    0.0300    0.0300    0.4550    0.0300    0.3133    0.8583
e    0.4550    0.3133    0.0300    0.4550    0.0300    1.2833
Σ       1         1         1         1         1

Table 45. Quotient Matrix, Q, of the House Graph’s PageRank Matrix, S, α = 0.85

           {c, d}      {a}      {b, e}       Σ
{c, d}     0.4850    0.0300    0.3433    0.8583
{a}        0.0600    0.0300    0.6267    0.7167
{b, e}     0.4850    0.4550    0.3433    1.2833
Σ          1.0300    0.5150    1.3133

As described in Section 2.3.5, the eigenvalues of the quotient matrix, Q, interlace
the eigenvalues of the PageRank matrix, S. Moreover, since a coarsest equitable partition,
$\mathcal{B}$, is equitable, Q’s eigenvalues are a subset of S’s eigenvalues, as reflected in Table 46.
The dominant eigenvectors of the PageRank matrix, S, and quotient matrix, Q, are listed
in Tables 47(a) and (b), respectively. The PageRank vector of S is assumed to be obtained
by multiplying a 5 × 5 matrix with a 5 × 1 vector during each power method iteration. The
PageRank vector of the quotient matrix, Q, is similarly assumed to be obtained by simply
multiplying a 3 × 3 matrix with a 3 × 1 vector during each power method iteration.

However, S’s PageRank vector, x, can be obtained by simply lifting it from Q’s
dominant eigenvector, r. This process multiplies Q’s eigenvector, r, with B, as reflected
in Table 47(c). Normalizing that lifted eigenvector yields the PageRank vector, x, listed in
Table 47(d), which, as required, equals the PageRank vector listed in Table 47(a).
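The lift-and-normalize steps can be checked numerically with the sketch below. It is an illustration under stated assumptions: the helper `power_method` and its tolerance are hypothetical, and Q is entered with the four-digit values printed in Table 45, so the results agree with Table 47 only to about three decimal places.

```python
def power_method(M, tol=1e-12, max_iter=10_000):
    """Dominant eigenvector of M, normalized to unit sum (sum norm)."""
    m = len(M)
    r = [1.0 / m] * m
    for _ in range(max_iter):
        s = r
        r = [sum(M[i][j] * s[j] for j in range(m)) for i in range(m)]
        total = sum(r)
        r = [v / total for v in r]
        if sum(abs(p - q) for p, q in zip(r, s)) <= tol:
            break
    return r


Q = [[0.4850, 0.0300, 0.3433],   # Table 45, rounded to four digits
     [0.0600, 0.0300, 0.6267],
     [0.4850, 0.4550, 0.3433]]
block_of = [1, 2, 0, 0, 2]       # a..e -> {c, d} = 0, {a} = 1, {b, e} = 2

r = power_method(Q)              # compare with Table 47(b)
x = [r[j] for j in block_of]     # lift, compare with Table 47(c)
total = sum(x)
x = [v / total for v in x]       # normalize, compare with Table 47(d)
```

Each iteration multiplies a 3 × 3 matrix by a 3 × 1 vector, and the 5-entry PageRank vector is produced only once, at the end.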

Table 46. Eigenvalues of the PageRank and Quotient Matrices

(a) sort(|λ_i(S)|)      (b) sort(|λ_i(Q)|)

      1.0000                  1.0000
     -0.7083                 -0.4250
     -0.4250                  0.2833
      0.2833
      0.0000


Table 47. Dominant Eigenvectors of the PageRank and Quotient Matrices

(a) x             (b) r                  (c) x = B·r        (d) x = x/Σx

a   0.1681        {c, d}   0.2949        a   0.2878         a   0.1681
b   0.2437        {a}      0.2878        b   0.4173         b   0.2437
c   0.1723        {b, e}   0.4173        c   0.2949         c   0.1723
d   0.1723        Σ        1.0000        d   0.2949         d   0.1723
e   0.2437                               e   0.4173         e   0.2437
Σ   1.0000                               Σ   1.7122         Σ   1.0000

5.4. QuotientRank: An Algorithm for Lifting PageRank Vectors

The algorithm listed in Figure 51 obtains the dominant eigenvector of the quotient
matrix induced by the coarsest equitable partition and lifts this eigenvector to obtain the
PageRank vector of the input matrix. The standard PageRank matrix, S, is constructed
with respect to the adjacency matrix, A, on lines 1-5. The coarsest equitable partition is
determined on lines 6-7 by applying one of the methods described in Section 2.3.3. The
characteristic block matrix, B, is computed on lines 9-11, and subsequently applied to the
PageRank matrix, S, to obtain the quotient matrix, Q, on line 12. The power method is
applied to Q on lines 13-27 to obtain its dominant eigenvector. Any vector norm could be
applied on line 24, where the sum norm is the most efficient to obtain. The final step is to
lift Q’s dominant eigenvector, r, on line 29, where normalizing the lifted vector, x, with
respect to its sum yields the PageRank vector of the input matrix, A, on line 30.

It is assumed that an arbitrary adjacency matrix, A, is dense. However, the block
matrix, B, should be a sparse matrix. As indicated on lines 10 and 11, B has a lone ‘1’ on
each row that reflects vertex membership in a coarsest equitable partition’s blocks. Thus,
one of the intermediate products, $N = B^{T} \cdot B$, computed on line 12 is a diagonal matrix,
such that a diagonal entry of N corresponds to the number of vertices in some block, i.e.,
$N_{ii} = |b_i|$. Hence, its inverse, $N^{-1} = (B^{T} \cdot B)^{-1}$, is obtained by reciprocating each
diagonal entry, i.e., $N^{-1}_{ii} = 1/N_{ii}$. Therefore, line 12 can be written as $Q \leftarrow N^{-1} \cdot B^{T} \cdot S \cdot B$,
where $N^{-1}$ is obtained by reciprocating B’s column sums. The lifting step performed on
line 29, $x = B \cdot r$, is equivalent to simply copying entries in r to their corresponding
positions in the PageRank vector, x.

1.  getPageRank(A, n, α, τ)
2.    # perturb adjacency matrix
3.    d_j ← Σ_i A_ij
4.    D ← diag(d)
5.    S ← α·A·D⁻¹ + (1 − α)/n
6.    # get coarsest equitable partition
7.    ℬ ← findCoarsestPartition(A, n)
8.    # construct block and quotient matrices
9.    b ← |ℬ|
10.   B ← 0^(n×b)
11.   B_ij ← 1, ∀ i, j : v_i ∈ b_j
12.   Q ← (Bᵀ·B)⁻¹·Bᵀ·S·B = B⁺·S·B
13.   # initialize vectors & iteration counter
14.   r ← 1^(b×1)/b
15.   s ← 0^(b×1)
16.   z ← 0
17.   # iterate power method
18.   while (‖s − r‖₁ > τ)
19.     # save eigenvector
20.     s ← r
21.     # update eigenvector
22.     r ← Q·r
23.     # normalize eigenvector
24.     r ← r/Σr
25.     # increment loop counter
26.     z ← z + 1
27.   end while
28.   # lift and normalize PageRank vector
29.   x ← B·r
30.   x ← x/Σx
31.   return x
32. end PageRank

Figure 51. QuotientRank: An Algorithm for Lifting PageRank Vectors
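The listing in Figure 51 can be sketched in plain Python as follows. This is a hedged illustration, not the thesis’s code: the function name `quotient_rank` is hypothetical, the coarsest equitable partition is passed in as `block_of` (one block index per vertex) instead of being computed by `findCoarsestPartition`, the block matrix B is kept implicit in that list, and the graph is assumed to have no isolated vertices. Comments refer to the line numbers of Figure 51.

```python
def quotient_rank(A, alpha, tau, block_of):
    """Sketch of Figure 51; block_of[v] is the block index of vertex v."""
    n = len(A)
    # lines 2-5: PageRank matrix S = alpha * A * D^-1 + (1 - alpha)/n
    d = [sum(A[i][j] for i in range(n)) for j in range(n)]
    S = [[alpha * A[i][j] / d[j] + (1.0 - alpha) / n for j in range(n)]
         for i in range(n)]
    # lines 8-12: Q = (B^T B)^-1 * B^T * S * B, with B implicit in block_of
    b = max(block_of) + 1
    sizes = [block_of.count(j) for j in range(b)]
    Q = [[0.0] * b for _ in range(b)]
    for u in range(n):
        for w in range(n):
            Q[block_of[u]][block_of[w]] += S[u][w] / sizes[block_of[u]]
    # lines 13-27: power method on Q using the sum norm
    r, s = [1.0 / b] * b, [0.0] * b
    while sum(abs(p - q) for p, q in zip(s, r)) > tau:
        s = r
        r = [sum(Q[i][j] * s[j] for j in range(b)) for i in range(b)]
        total = sum(r)
        r = [v / total for v in r]
    # lines 28-30: lift and normalize
    x = [r[j] for j in block_of]
    total = sum(x)
    return [v / total for v in x]


# Three-vertex path: both end vertices share a block, the middle vertex is alone.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
x = quotient_rank(A, 0.85, 1e-12, [0, 1, 0])
```

Since both end vertices are copied from the same entry of r, their PageRank values are identical bit for bit, which is the equal-value property the lifting step is meant to provide.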
5.5. Complexity Analysis

Applying the PageRank perturbation on lines 2-5 is $O(n^2)$, since all $n^2$ entries must
be scaled and shifted. Computing the coarsest equitable partition, $\mathcal{B}$, on lines 6-7 requires
$\Theta(n^2 \cdot \log n)$ time, as described in Section 2.3.3.2. Constructing the block matrix, B, on
lines 9-11 from the coarsest equitable partition, $\mathcal{B}$, requires $O(n)$ time, where n ones are
placed at B’s rows and columns corresponding to the vertex block membership in $\mathcal{B}$. Since
the $n \times b$ matrix, B, only contains n non-zero entries, it is stored as a sparse matrix, which
only consumes $O(n)$ space.

The quotient matrix construction, $Q = (B^{T} \cdot B)^{-1} \cdot B^{T} \cdot S \cdot B$, performed on line 12
can be decomposed into two steps, $Q = N^{-1} \cdot Z$, where $N^{-1} = (B^{T} \cdot B)^{-1}$ and $Z = B^{T} \cdot S \cdot B$.
The first step, $N^{-1} = (B^{T} \cdot B)^{-1}$, yields a diagonal matrix whose diagonal entries equal the
reciprocal of the number of vertices contained in the corresponding block in $\mathcal{B}$. Thus, the
diagonal entries in $N^{-1}$ equal B’s reciprocated column sums. Moreover, since $N = B^{T} \cdot B$
is a diagonal matrix, its inverse, $N^{-1}$, can be computed in $O(n)$ time.

The second step, $Z = B^{T} \cdot S \cdot B$, multiplies a $b \times n$ matrix, $B^{T}$, by an $n \times n$ matrix,
S, yielding the $b \times n$ product, $B^{T} \cdot S$. This $b \times n$ matrix is multiplied by an $n \times b$ matrix, B,
yielding a $b \times b$ matrix, $Z = B^{T} \cdot S \cdot B$. These multiplications are $O(b \cdot n^2)$ and $O(b^2 \cdot n)$,
respectively, where $b \le n$. Thus, if $\mathcal{B}$ is a discrete partition, which is the worst case, these
bounds become $O(n^3)$. Since B contains n non-zero entries, storing B as a sparse matrix
reduces the bounds to $O(n^2)$ and $O(b \cdot n)$, respectively, yielding $O(n^2)$ if $b = n$.
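The sparse bounds can be made concrete by counting additions in a two-step construction of Q. The sketch below is illustrative only: `quotient_via_sparse_b` and the `ops` counter are hypothetical, and the sparse block matrix is assumed to be represented by `block_of`, one block index per vertex.

```python
def quotient_via_sparse_b(S, block_of, b):
    """Q = N^-1 * (B^T * S) * B, with B stored as one block index per row."""
    n = len(S)
    ops = 0
    # Step 1: B^T * S adds each row of S into the row of its block (n * n additions).
    BtS = [[0.0] * n for _ in range(b)]
    for u in range(n):
        row = BtS[block_of[u]]
        for w in range(n):
            row[w] += S[u][w]
            ops += 1
    # Step 2: (B^T * S) * B adds each column into the column of its block (b * n additions).
    Z = [[0.0] * b for _ in range(b)]
    for i in range(b):
        for w in range(n):
            Z[i][block_of[w]] += BtS[i][w]
            ops += 1
    # Step 3: N^-1 scales row i by 1/|b_i|, the reciprocated column sums of B.
    sizes = [0] * b
    for j in block_of:
        sizes[j] += 1
    Q = [[Z[i][j] / sizes[i] for j in range(b)] for i in range(b)]
    return Q, ops


S = [[0.25] * 4 for _ in range(4)]   # toy 4 x 4 matrix, for counting only
Q, ops = quotient_via_sparse_b(S, [0, 0, 1, 1], 2)
```

For this toy input the counter reports n·n + b·n = 16 + 8 additions, in line with the $O(n^2)$ and $O(b \cdot n)$ terms above.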
The same bounds can also be derived using the tools described in Section 2.3.5.2.
Given  an  arbitrary  vertex  partition,  B,  Q .  ^  ^  ^ ,  where  \<i,j<b,  \<k<n,  e  b. , 

and  e  bj.  The  complexity  of  this  summation  is  0{b^  -n^,  since  each  entry  in  the  bxb 
matrix,  Q,  may  consume  as  many  as  n  summations.  If  b  =  n,  which  is  the  worst  possible 
case, the best upper bound that can be obtained initially appears to be $O(n^3)$.

However, since $\mathcal{B}$ is assumed to be the coarsest equitable partition, which is an
equitable partition, the summation can be computed more efficiently. In particular, each
vertex contained in block $b_i$ has an equal number of neighbors with respect to block $b_j$,
where $1 \le i, j \le b$. Therefore, the number of neighbors with respect to block $b_j$ must only
be determined for a single vertex in block $b_i$, i.e., $z_{ij} = |N(u) \cap b_j| = |N(v) \cap b_j|$, where
$1 \le i, j \le b$, $u, v \in b_i$, and $N(u)$ denotes u’s neighborhood (cf. Sections 2.3.2 and 2.3.3).
Thus, $Q_{ij} = z_{ij} \cdot S_{uw}$ for some arbitrary vertex, $u \in b_i$, and some arbitrary neighbor of u,
$w \in b_j$. This approach yields the same bound on building Q (cf. Section 2.3.5.3).
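The single-representative shortcut can be sketched as follows. The helper names `block_neighbor_counts` and `is_equitable` are hypothetical, and the partition is assumed to be given as a list of vertex-index lists. The house graph of Table 43 serves as the example.

```python
def block_neighbor_counts(A, blocks):
    """Return z[i][j] = |N(u) ∩ b_j| using one representative u per block b_i."""
    return [[sum(A[bi[0]][w] for w in bj) for bj in blocks] for bi in blocks]


def is_equitable(A, blocks):
    """Check that every vertex of b_i has the same number of neighbors in b_j."""
    z = block_neighbor_counts(A, blocks)
    return all(sum(A[u][w] for w in bj) == z[i][j]
               for i, bi in enumerate(blocks)
               for u in bi
               for j, bj in enumerate(blocks))


A = [[0, 1, 0, 0, 1],   # a
     [1, 0, 1, 0, 1],   # b
     [0, 1, 0, 1, 0],   # c
     [0, 0, 1, 0, 1],   # d
     [1, 1, 0, 1, 0]]   # e
blocks = [[2, 3], [0], [1, 4]]          # {c, d}, {a}, {b, e}
z = block_neighbor_counts(A, blocks)
```

The representative-only count is valid only when the partition really is equitable, which is why `is_equitable` compares it against every vertex.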

The most interesting analysis step occurs when assessing the complexity of using
the power method on lines 13-27 to determine the dominant eigenvector of the quotient
matrix, Q. As described in Sections 2.5.2.2 and 3.2, given some precision, $\tau$, and scaling
factor, $\alpha \ge 0.5$, using the power method to obtain the PageRank vector of the PageRank
matrix, S, yields the lower and upper bounds of $\Omega(n^2 \cdot \log n)$ and $O(n^2 \cdot t)$, respectively.
The value of t is bounded by $t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$, where $\lambda_2(S)$ denotes the eigenvalue
with the second largest magnitude and $|\lambda_2(S)| \le \alpha$.
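The bound on t can be motivated by the usual convergence behavior of the power method. The short derivation below is a sketch under the assumption that the dominant eigenvalue is one and that the iteration error shrinks by a factor of roughly $|\lambda_2(S)|$ per iteration; the iteration index $k$ is introduced here only for illustration, and $0 < |\lambda_2(S)|$ and $0 < \tau < 1$ are assumed.

```latex
\begin{align*}
|\lambda_2(S)|^{k} \le \tau
  &\iff k \cdot \log |\lambda_2(S)| \le \log \tau \\
  &\iff k \ge \frac{\log \tau}{\log |\lambda_2(S)|} = \log_{|\lambda_2(S)|} \tau
  && \text{(dividing by } \log |\lambda_2(S)| < 0 \text{ flips the inequality)} \\[4pt]
|\lambda_2(S)| \le \alpha < 1
  &\implies \log |\lambda_2(S)| \le \log \alpha < 0 \\
  &\implies \log_{|\lambda_2(S)|} \tau = \frac{\log \tau}{\log |\lambda_2(S)|}
     \le \frac{\log \tau}{\log \alpha} = \log_{\alpha} \tau .
\end{align*}
```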
The power method is used to obtain the dominant eigenvector, r, of the quotient
matrix, Q, where r is lifted to obtain the PageRank vector, x, of the PageRank matrix, S.
Since S and Q are $n \times n$ and $b \times b$ matrices, respectively, where $b \le n$, substitution into
the bounds $\Omega(n^2 \cdot \log n)$ and $O(n^2 \cdot t)$ yields $\Omega(b^2 \cdot \log b)$ and $O(b^2 \cdot t)$, respectively.

Theorem 6  The practical lower bound on the execution time needed to
obtain the dominant eigenvector of $Q^{b \times b}$ is $\Omega(b^2 \cdot \log b)$.

Proof  The practical lower bound on the number of iterations was based
on assuming $\tau \le 1/n$. Since vertices contained in a given block have equal
PageRank value, the maximum precision increases to $\tau \le 1/b$. Substitution
yields the new practical minimum number of iterations, $\log_2 b = -\log_2 (1/b)$. ■

The upper bound on the number of power method iterations, r, needed to compute
Q’s PageRank vector, r, may be less than the bound on the number of iterations needed to
compute S’s PageRank vector, x, since $r \le \log_{|\lambda_2(Q)|} \tau \le t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$.

Theorem 7  The upper bound on the number of power method iterations,
r, required to obtain the dominant eigenvector, x, of an equitable quotient
matrix, Q, is bounded by the number of iterations, t, needed to obtain the
PageRank vector of the PageRank matrix, S.

Proof  The number of iterations, r, required to determine x is bounded by
$\log_{|\lambda_2(Q)|} \tau$ [GoV88]. Interlacing ensures $|\lambda_2(Q)| = |\lambda_i(S)| \le |\lambda_2(S)|$ for some i,
therefore, $r \le \log_{|\lambda_2(Q)|} \tau \le t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$. ■

If $|\lambda_2(Q)| < |\lambda_2(S)|$, the upper bound on the number of power method iterations
may differ with respect to S and Q, where $r = \log_{|\lambda_2(Q)|} \tau$ and $t = \log_{|\lambda_2(S)|} \tau$, so that $r < t$.
In the worst case, if the coarsest equitable partition, $\mathcal{B}$, is a discrete partition, i.e., if none
of $\mathcal{B}$’s blocks are composed of multiple vertices, then $b = n$, $\lambda_2(Q) = \lambda_2(S)$, and $r = t$. The
converse is not true, since for certain graphs, $b < n$, yet $\lambda_2(Q) = \lambda_2(S)$, thus, $r = t$.
Thus, applying power method iteration to the quotient matrix, Q, of the PageRank
matrix, S, reduces the lower and upper bounds on the complexity of the power method in
two ways. The matrix size reduction, from $S^{n \times n}$ to $Q^{b \times b}$, reduces the bounds on the power
method from $\Omega(n^2 \cdot \log n)$ and $O(n^2 \cdot t)$ to $\Omega(b^2 \cdot \log b)$ and $O(b^2 \cdot t)$, respectively,
where the performance scales in proportion to $n^2/b^2$. The upper bound on the number of
power method iterations, t, is replaced by r, yielding $O(b^2 \cdot r)$, where r depends on Q’s,
not S’s, second dominant eigenvalue, i.e., $r \le \log_{|\lambda_2(Q)|} \tau \le t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$. If B is stored
as a sparse matrix, the lifting step is $O(n)$, since B contains n non-zero entries.

Thus, the lower bound of the QuotientRank algorithm described in Section 5.4 is
based on computing the coarsest equitable partition and applying the power method to Q,
which are $\Omega(n^2 \cdot \log n)$ and $\Omega(b^2 \cdot \log b)$, respectively. Hence, its overall lower bound is
$\Omega(n^2 \cdot \log n + b^2 \cdot \log b)$. The corresponding upper bound on the QuotientRank algorithm
is $O(n^2 \cdot \log n + b^2 \cdot r)$, where $b \le n$, $r \le t$, $r \le \log_{|\lambda_2(Q)|} \tau$, and $t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$.

If $b \ll n$, a QuotientRank algorithm implementation that uses an efficient method
of computing the coarsest equitable partition outperforms the PageRank algorithm, which
is $\Omega(n^2 \cdot \log n)$ and $O(n^2 \cdot t)$. The gains are proportional to $(n^2 \cdot t)/(b^2 \cdot r)$, where $b \le n$,
$r \le \log_{|\lambda_2(Q)|} \tau$, $t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$, and $r \le t$. The worst case again occurs if the graph’s
coarsest equitable partition, $\mathcal{B}$, is discrete, where if $b = n$, the PageRank algorithm’s lower bound
is doubled by the cost to find $\mathcal{B}$, for a total of $2 \cdot n^2 \cdot \log n$ operations. The QuotientRank
algorithm is more efficient, however, if the coarsest equitable partition is non-discrete.
5.6. A QuotientRank Example

This example illustrates the potential performance gain obtained by applying the
QuotientRank algorithm. The pseudo-benzene graph shown in Figure 52(a) has a coarsest
equitable partition of two blocks, [{2,4,6,8,10,12}, {1,3,5,7,9,11}]. The graph’s 12 × 12
adjacency and PageRank matrices are not listed. The 2 × 2 quotient matrix induced by its
coarsest equitable partition and PageRank matrix is shown in Table 48 and illustrated in
Figure 52(b). Finding its coarsest equitable partition requires as many as $n^2 \cdot \log_2 n$ (517)
operations. Constructing the PageRank matrix, S, which is used in both the QuotientRank
and PageRank algorithms, requires $n^2$ (144) operations. Storing a 12 × 2 block matrix, B,
as a sparse matrix only requires n (12) operations. Constructing a 2 × 2 quotient matrix,
$Q = (B^{T} \cdot B)^{-1} \cdot B^{T} \cdot S \cdot B$, requires $b^2 \cdot n$ (48) operations.


(b)  PageRank-Induced  Quotient  Graph 


Figure  52.  Pseudo-Benzene  Graph  and  Its  PageRank-Induced  Quotient  Graph 


Table 48. A 2 × 2 PageRank-Induced Quotient Matrix, Q

                                  destination
source               {2,4,6,8,10,12}    {1,3,5,7,9,11}       Σ
{2,4,6,8,10,12}           0.0750             0.6417        0.7167
{1,3,5,7,9,11}            0.9250             0.3583        1.2833
Σ                            1                  1
A PageRank vector computed to 52 bits of precision, $\tau = 2^{-52}$, or approximately
15 decimal digits, corresponds to IEEE 754 double-precision values [ISB85]. Applying the
power method to the PageRank matrix, S, and the induced quotient matrix, Q, requires
$n^2 \cdot t$, $t \le \log_{|\lambda_2(S)|} \tau$, and $b^2 \cdot r$, $r \le \log_{|\lambda_2(Q)|} \tau$, time, respectively. The eigenvalues of S and Q
reveal that $|\lambda_2(S)| = 0.85$ and $|\lambda_2(Q)| = 0.5667$, i.e., the largest number of power method
iterations that must be performed with respect to S or Q are $t = \lceil \log_{0.85} 2^{-52} \rceil = 222$ and
$r = \lceil \log_{0.5667} 2^{-52} \rceil = 64$, respectively. Thus, in the worst case, applying the power method
to S and Q costs $12^2 \cdot 222 = 31{,}968$ and $2^2 \cdot 64 = 256$ operations, respectively.
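The iteration counts and worst-case costs quoted above can be reproduced with a few lines of arithmetic. The helper name `iteration_bound` is hypothetical; the eigenvalue magnitudes, n, b, and τ are the ones stated in the text.

```python
import math


def iteration_bound(lambda2, tau):
    """Smallest integer k with |lambda2|**k <= tau, i.e., ceil(log_{|lambda2|} tau)."""
    return math.ceil(math.log(tau) / math.log(abs(lambda2)))


tau = 2.0 ** -52
n, b = 12, 2
t = iteration_bound(0.85, tau)     # iterations with respect to S
r = iteration_bound(0.5667, tau)   # iterations with respect to Q
cost_s = n * n * t                 # one n x n matrix-vector product per iteration
cost_q = b * b * r                 # one b x b matrix-vector product per iteration
```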

The dominant eigenvector, r, of Q is shown in Table 49(a). Lifting S’s dominant
eigenvector, x, from r, results in the vector shown in Table 49(b). Normalizing the vector
by its sum yields S’s PageRank vector listed in Table 49(c). These lifting and normalizing
steps require $3 \cdot n$ (36) operations, 12 to compute $B \cdot r$ and 24 to compute $x/\sum x$.

Table 49. Lifting the PageRank Vector from the Dominant Eigenvector

(a) r

 1   0.4096
 2   0.5904
 Σ   1.0000

(b) x = B·r

 1   0.5904
 2   0.4096
 3   0.5904
 4   0.4096
 5   0.5904
 6   0.4096
 7   0.5904
 8   0.4096
 9   0.5904
10   0.4096
11   0.5904
12   0.4096
 Σ   6.0000

(c) x = x/Σx
Summing each algorithm’s number of operations yields Table 50, which confirms
using the QuotientRank algorithm is more efficient than using the PageRank algorithm to
obtain the PageRank vector of the graph shown in Figure 52. Although some overhead is
required to build the quotient matrix and lift the PageRank vector, the execution time is
reduced by a factor of 32112/1109 ≈ 29. Executing each algorithm shows
the number of observed power method iterations for S and Q are 60 and 62, respectively,
as shown in Table 51. Although the PageRank algorithm executes two fewer iterations, it
still needs more execution time than the QuotientRank algorithm, where 8784/1101 ≈ 8.
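The totals behind these two ratios can be recomputed from the per-task costs. In the sketch below the function names are hypothetical, and the cost of constructing Q is taken as n·n, an assumption chosen to match the 144 operations listed for that task in Tables 50 and 51.

```python
import math


def pagerank_ops(n, t):
    """Construct S (n^2) plus t power method iterations on an n x n matrix."""
    return n * n + n * n * t


def quotientrank_ops(n, b, r):
    """Partition, S, sparse B, Q (assumed n^2), power method on Q, lift and normalize."""
    partition = math.ceil(n * n * math.log2(n))
    return partition + n * n + n + n * n + b * b * r + 3 * n


theoretical = (pagerank_ops(12, 222), quotientrank_ops(12, 2, 64))
observed = (pagerank_ops(12, 60), quotientrank_ops(12, 2, 62))
theoretical_gain = theoretical[0] / theoretical[1]
observed_gain = observed[0] / observed[1]
```

The partition cost alone accounts for roughly half of the QuotientRank total in both cases, which is why the observed gain stays well below $n^2/b^2 = 36$.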

Table 50. Theoretical Iterations: $n = 12$, $b = 2$, $\tau = 2^{-52}$, $t = 222$, $r = 64$

(a) PageRank Algorithm

Task                   Time                     Total
Construct S            $n^2$                      144
Power Method           $n^2 \cdot t$            31968
Σ                                               32112

(b) QuotientRank Algorithm

Task                   Time                     Total
Compute $\mathcal{B}$  $n^2 \cdot \log_2 n$       517
Construct S            $n^2$                      144
Construct B            $n$                         12
Construct Q            $n^2$                      144
Power Method           $b^2 \cdot r$              256
Lift & Normalize x     $3 \cdot n$                 36
Σ                                                1109

Table 51. Observed Iterations: $n = 12$, $b = 2$, $\tau = 2^{-52}$, $t = 60$, $r = 62$

(a) PageRank Algorithm

Task                   Time                     Total
Construct S            $n^2$                      144
Power Method           $n^2 \cdot t$             8640
Σ                                                8784

(b) QuotientRank Algorithm

Task                   Time                     Total
Compute $\mathcal{B}$  $n^2 \cdot \log_2 n$       517
Construct S            $n^2$                      144
Construct B            $n$                         12
Construct Q            $n^2$                      144
Power Method           $b^2 \cdot r$              248
Lift & Normalize x     $3 \cdot n$                 36
Σ                                                1101
5.7. QuotientRank Applicability

Section 4.3.4 informally considered when the algorithm described in Section 4.3.2
or the QuotientRank algorithm described in Section 5.4 should be used to compute a graph’s
PageRank vector instead of applying the PageRank algorithm. The material contained in
this section provides a more robust analysis with respect to the QuotientRank algorithm,
which is the most significant new algorithm described herein. If dense matrices are used,
with the notable exception of the block matrix, B, which is stored as a sparse matrix, the
key factors in assessing if the PageRank algorithm or QuotientRank algorithm is the more
efficient method of computing the PageRank vector, x, of an arbitrary graph, G, are:

• $n = |V|$, number of vertices contained in G

• $b = |\mathcal{B}|$, number of blocks contained in G’s coarsest equitable partition, $\mathcal{B}$

• $\alpha$, $\alpha \ge 0.5$, scaling factor used to construct the PageRank matrix, S

• $\tau$, $\tau \le 1/n$, numerical precision of the PageRank vector, x

• $t$, $t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$, maximum iterations with respect to S

• $r$, $r \le \log_{|\lambda_2(Q)|} \tau$, maximum iterations with respect to quotient matrix, Q

The number of blocks, b, contained in the graph’s coarsest equitable partition, $\mathcal{B}$,
is bounded by the number of vertices, n, in G, i.e., $b \le n$. Furthermore, as established in
Section 5.5, the maximum number of required power method iterations with respect to S
equals or exceeds the number of maximum iterations needed with respect to Q, i.e., $r \le t$,
since $r \le \log_{|\lambda_2(Q)|} \tau \le t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$. The following bounds, in particular, the lower
bounds, are derived by assuming the scaling factor, $\alpha$, used in the PageRank perturbation
is contained in the range, $0.5 \le \alpha < 1$, and the PageRank vector is computed to a precision
that nominally provides for at least n unique PageRank values, i.e., $\tau \le 1/n$.
The PageRank algorithm’s practical lower bound on its required execution time is
$\Omega(n^2 \cdot \log n)$, a new result derived in Section 3.2. The analysis in Section 5.5 showed the
QuotientRank algorithm’s lower bound is $\Omega(n^2 \cdot \log n + b^2 \cdot \log b)$. The number of blocks,
b, contained in the coarsest equitable partition, $\mathcal{B}$, is bounded by the number of vertices,
$n = |V|$, where $b \le n$. The coarsest equitable partition can be found in $\Theta(n^2 \cdot \log n)$ time.

In the worst case, if G yields a discrete coarsest equitable partition, i.e., if $b = n$,
the QuotientRank algorithm increases the PageRank algorithm’s lower bound by a factor
of two, where $n^2 \cdot \log n + b^2 \cdot \log b = 2 \cdot n^2 \cdot \log n$. The impact of this worst-case scenario is
offset by noting two more significant results. First, the QuotientRank algorithm decreases
the execution time required to determine the PageRank vector of those graphs that yield a
non-discrete coarsest equitable partition, i.e., if $b < n$. Second, for such graphs, the QuotientRank
algorithm also ensures vertices contained in the same block have equal PageRank values.

The PageRank algorithm’s upper bound is $O(n^2 \cdot t)$, where $t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$,
as shown in Section 2.5.2.2. The analysis in Section 5.5 establishes that the QuotientRank
algorithm’s upper bound is $O(n^2 \cdot \log n + b^2 \cdot r)$, where $r \le \log_{|\lambda_2(Q)|} \tau \le t$. This bound also
degenerates in the worst case, i.e., if the coarsest equitable partition is discrete, where
$b = n$ yields $O(n^2 \cdot \log n + n^2 \cdot t)$. Thus, in the worst case, the QuotientRank algorithm
may need $n^2 \cdot \log n$ more time than the PageRank algorithm. However, the QuotientRank
algorithm outperforms the PageRank algorithm if the graph’s coarsest equitable partition
is non-discrete, i.e., contains sufficiently fewer blocks than the graph contains vertices.
The upper bounds on the PageRank and QuotientRank algorithms’ execution time
are $O(n^2 \cdot t)$ and $O(n^2 \cdot \log n + b^2 \cdot r)$, respectively. The $n^2 \cdot t$ and $b^2 \cdot r$ terms denote the
time needed to apply the power method to the PageRank and quotient matrices, S and Q,
respectively. Thus, the QuotientRank algorithm outperforms the PageRank algorithm if
$(n^2 \cdot t - b^2 \cdot r) > n^2 \cdot \log n$, where $n^2 \cdot \log n$ denotes the time needed to obtain the coarsest
equitable partition. The anticipated reduction in execution time is $(n^2 \cdot t)/(b^2 \cdot r)$, where
$r \le t$ and $b \le n$. Although $r \le \log_{|\lambda_2(Q)|} \tau \le t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$, preliminary data suggests
$(r \approx t) \le \log_{|\lambda_2(Q)|} \tau \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$, reducing the relative performance gain to $n^2/b^2$.

If b is sufficiently small relative to n, i.e., if a coarsest equitable partition contains
fewer blocks than vertices, the QuotientRank algorithm obtains a PageRank vector in less
time than the PageRank algorithm. However, b’s value can only be known by computing
the coarsest equitable partition, which requires $n^2 \cdot \log n$ time. Graphs containing regular
structure yield such gains, e.g., the grid graph depicted in Figure 53.
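The break-even condition stated above can be turned into a small predicate. This is a sketch only: the function names are hypothetical, and the logarithm is assumed to be base 2, as in the operation counts of Section 5.6.

```python
import math


def quotientrank_pays_off(n, b, t, r):
    """True if (n^2 * t - b^2 * r) > n^2 * log2(n)."""
    return n * n * t - b * b * r > n * n * math.log2(n)


def largest_beneficial_b(n, t, r):
    """Largest block count b <= n for which the condition still holds, or 0."""
    return max((b for b in range(1, n + 1)
                if quotientrank_pays_off(n, b, t, r)), default=0)


# Pseudo-benzene example with the theoretical iteration bounds of Section 5.6.
gain_expected = quotientrank_pays_off(12, 2, 222, 64)
# Discrete partition with equal iteration counts: no power method savings remain.
discrete_case = quotientrank_pays_off(12, 12, 60, 60)
# With r = t = 60 on 12 vertices, how many blocks can the partition have?
b_limit = largest_beneficial_b(12, 60, 60)
```

Note that the predicate needs b and r as inputs, so in practice it can only be evaluated after the partition has already been computed.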

These  relative  performance  gains  increase  as  the  number  of  blocks  decreases  with 
respect  to  the  number  of  vertices,  i.e.,  as  b  decreases  with  respect  to  n.  For  instance,  many 
trees  have  a  coarsest  equitable  partition  containing  fewer  blocks  than  vertices.  Finally,  at 
least  certain  web  graphs  yield  a  coarsest  equitable  partition  containing  fewer  blocks  than 
vertices  [BLS+06]. 


Figure  53.  8x3  Grid:  A  Graph  Yielding  a  Non-Discrete  Coarsest  Equitable  Partition 




VI.  Conclusions  and  Future  Research 


6.1.  Conclusions 

6.1.1.  Complexity  Bounds 

The first new result, which was derived in Section 3.2, is a practical lower bound
on the number of power method iterations that is obtained by applying recent work that
defined an upper bound [HaK03, LaM06]. The lower bound is derived by applying two
assumptions related to the required precision, $\tau$, and PageRank scaling value, $\alpha$. The first
assumption is $\tau \le 1/n$, where $\tau = 1/n$ is the largest precision that can provide n unique
PageRank values, i.e., one value for each of the n vertices. Assuming $\alpha \ge 0.5$ includes
the suggested default PageRank scaling value, $\alpha = 0.85$. Applying these two assumptions
and applying base-2 logarithms yields a practical lower bound on the number of required
power method iterations, t, needed to compute the PageRank vector, where

$\frac{\log_2 1/n}{\log_2 0.5} = \log_2 n \le t \le \log_{\alpha} \tau = \frac{\log_2 \tau}{\log_2 \alpha}$. (40)

The practical lower bound, $\log_2 n$, matches experimental results reported by the
PageRank algorithm’s creators [PBM+98] and other researchers [ANT+02]. Each power
method iteration performed in the PageRank algorithm multiplies an $n \times n$ matrix with an
$n \times 1$ vector. If dense matrices are used, the lower and upper bounds on the time needed to
determine the PageRank vector are $\Omega(n^2 \cdot \log n)$ and $O(n^2 \cdot \log_{\alpha} \tau)$, respectively, where
$\log n$ and $\log_{\alpha} \tau$ are the minimum and maximum required number of iterations. If sparse
matrices are being used, the lower and upper bounds are $\Omega(m \cdot \log n)$ and $O(m \cdot \log_{\alpha} \tau)$,
respectively, where $m = |E|$ denotes the number of edges contained in the graph.
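As a rough numerical illustration of the bounds in (40), the worked instance below uses $n = 12$, $\alpha = 0.85$, and $\tau = 2^{-52}$. These values are borrowed from the example in Section 5.6 and are used here only as an assumption for illustration.

```latex
\begin{align*}
\log_2 n &= \log_2 12 \approx 3.58, \\
\log_{\alpha} \tau &= \frac{\log_2 2^{-52}}{\log_2 0.85}
                    \approx \frac{-52}{-0.2345} \approx 221.8, \\
3.58 &\le t \le 221.8 .
\end{align*}
```

Rounding the upper limit up to the next integer gives the 222 iterations quoted for S in Section 5.6.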
6.1.2. Obtaining Canonical Vertex Orderings from the PageRank Vector

Another result, which is not known to be formally stated elsewhere with respect to
the PageRank algorithm, is related to the concept of finding a canonical vertex ordering.
Assuming sufficient computing resources exist, the two methods described in Section 1.2
can obtain a canonical order, where the first method only uses the PageRank algorithm.
For example, the mansion graph illustrated in Figure 54(a) yields a PageRank vector that
induces a canonical vertex order, as shown in Figure 54(b).

However, graphs relevant to the work described herein yield non-discrete coarsest
equitable partitions. Thus, their PageRank vectors contain at least two equal entries and
cannot induce a canonical ordering. For such graphs, a canonical ordering can be found
using an application that computes a canonical isomorph, e.g., nauty, and applying the
PageRank algorithm to the canonical isomorph. For example, the canonical isomorph that
is yielded by applying nauty to the house graph is shown in Figure 54(c), where its PageRank
vector induces the canonical vertex ordering illustrated in Figure 54(d).
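Whether a given PageRank vector can induce a vertex ordering on its own comes down to checking for ties. The sketch below illustrates that check; the function name `canonical_order`, the tolerance, and the choice of descending order are assumptions made here for illustration and are not taken from the thesis.

```python
def canonical_order(x, tol=1e-12):
    """Return vertices sorted by descending PageRank, or None if any two values tie."""
    order = sorted(range(len(x)), key=lambda v: -x[v])
    for u, v in zip(order, order[1:]):
        if abs(x[u] - x[v]) <= tol:
            return None
    return order


# House graph values from Table 47(a): blocks {c, d} and {b, e} produce ties.
house = canonical_order([0.1681, 0.2437, 0.1723, 0.1723, 0.2437])
# A made-up vector with four distinct values orders its vertices unambiguously.
distinct = canonical_order([0.4, 0.1, 0.3, 0.2])
```

Only adjacent entries of the sorted order need to be compared, since any tie between two values makes them neighbors after sorting.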

However, finding a canonical isomorph may require exponential time, therefore,
the latter approach only succeeds if it is applied to sufficiently small and irregular graphs.
For instance, nearly all random graphs and random trees are sufficiently irregular, even
if their coarsest equitable partition is not discrete.


(c)  House  Graph  (d)  PageRank  Order 


Figure  54.  Canonical  PageRank  Orderings  of  the  Mansion  and  House  Graphs 
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6.1.3.  Relating  the  PageRank  Vector  and  the  Coarsest  Equitable  Partition 

The  most  theoretical  contribution  constructed  herein  is  contained  in  Section  3.3, 

where  it  is  shown  that  vertices  contained  in  the  same  block  of  a  graph’s  coarsest  equitable 
partition  must  have  equal  iterated  dot  products.  Since  a  PageRank  vector  can  be  obtained 
by  applying  the  power  method,  which  simply  computes  an  iterated  dot  product,  vertices 
contained  in  the  same  block  yield  equal  PageRank  values,  as  described  in  Section  3.4. 
This  result  mirrors  another  proof  of  the  relationship  between  the  PageRank  vector  and  the 
coarsest  equitable  partition  [BLS+06].  Boldi  et  al.'s  earlier  proof  uses  tools  from  category 
theory,  e.g.,  the  graph’s  minimum  base  and  its  fibrations,  which  correspond  to  the  graph’s 
coarsest  equitable  partition  and  its  blocks. 
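
The relationship can be checked numerically on a small case. The following is a minimal sketch, not the proof or any implementation from Chapter 3: it assumes a hypothetical five-vertex path graph, a scaling factor of 0.85, and a hand-written partition, and it records how far apart the values within each block drift at every step of the power method.

```python
# Sketch: power-method iterates stay constant on the blocks of an
# equitable partition. The graph, partition, and ALPHA are assumptions.
ALPHA = 0.85

# Hypothetical example: the path 0-1-2-3-4 and its coarsest equitable partition.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
blocks = [[0, 4], [1, 3], [2]]


def pagerank_step(adj, x, alpha=ALPHA):
    """One step x <- S x with S = alpha*A*D^-1 + (1-alpha)/n * ones.

    The teleport term is written as (1-alpha)/n, which relies on the
    entries of x summing to one.
    """
    n = len(adj)
    return {
        v: (1 - alpha) / n + alpha * sum(x[u] / len(adj[u]) for u in adj[v])
        for v in adj
    }


def max_block_spread(x, blocks):
    """Largest difference between two values that share a block."""
    return max(max(x[v] for v in blk) - min(x[v] for v in blk) for blk in blocks)


x = {v: 1.0 / len(adj) for v in adj}  # uniform start vector
spreads = []
for _ in range(100):
    x = pagerank_step(adj, x)
    spreads.append(max_block_spread(x, blocks))

print(max(spreads))
print({v: round(val, 4) for v, val in sorted(x.items())})
```

On larger graphs, finite-precision arithmetic may introduce small differences within a block, which is the situation the AverageRank algorithm is meant to address.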

Both  of  the  proofs  suggest  many  methods  of  improving  the  PageRank  algorithm’s 
performance.  The  first  improvement  is  obtained  by  the  AverageRank  algorithm  listed  in 
Section 4.2, which uses the average block PageRank values. That method’s lower bound
is the same as the PageRank algorithm’s, $\Omega(n^2 \cdot \log n)$, where $n$ denotes the number of
vertices. However, its upper bound is increased from $O(n^2 \cdot t)$ to $O(n^2 \cdot \log n + n^2 \cdot t)$,
where $t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$ and $S$ denotes the PageRank matrix.

That  method  is  superseded  by  the  ProductRank  algorithm  listed  in  Section  4.3,  in 
which  one  PageRank  value  is  computed  for  each  block  in  the  coarsest  equitable  partition. 
The ProductRank algorithm’s lower bound is also the same as the PageRank algorithm’s
lower bound, $\Omega(n^2 \cdot \log n)$. More importantly, the upper bound reduces from $O(n^2 \cdot t)$ to
$O(n^2 \cdot \log n + b \cdot n \cdot t)$, where $b$ denotes the number of blocks. The ProductRank algorithm
ensures vertices contained in the same block have the same PageRank value.

Both algorithms are superseded by the proof provided in Section 5.2 that shows a
certain quotient matrix induced by the coarsest equitable partition can be used to compute
the PageRank vector. That proof is the basis of the QuotientRank algorithm described in
Section 5.4. As shown in Section 5.5, the lower and upper bounds of the QuotientRank
algorithm are $\Omega(n^2 \cdot \log n)$ and $O(n^2 \cdot \log n + b^2 \cdot r)$, respectively, where $b$ denotes the
number of blocks contained in the coarsest equitable partition. The value of $r$ is based on
the second eigenvalue of the quotient matrix, $Q$, the precision, $\tau$, and scaling factor, $\alpha$,
such that $r \le \log_{|\lambda_2(Q)|} \tau \le t \le \log_{|\lambda_2(S)|} \tau \le \log_{\alpha} \tau$.

The  QuotientRank  algorithm  further  extends  [BLS+06],  which  also  showed  that 
the  quotient  matrix  can  be  used  to  reduce  the  time  needed  to  obtain  the  PageRank  vector. 
However,  that  proof’s  authors  did  not  develop  or  analyze  a  method  based  on  the  quotient 
matrix. As established in Sections 5.5-5.7, the QuotientRank algorithm’s upper bound is
$O(n^2 \cdot \log n + b^2 \cdot r)$, whereas the PageRank algorithm’s upper bound is $O(n^2 \cdot t)$. Thus,
if $(n^2 \cdot t - b^2 \cdot r) > n^2 \cdot \log n$, the potential performance gain is $(n^2 \cdot t)/(b^2 \cdot r)$.

In  sum,  PageRank  vectors  of  graphs  containing  at  least  some  regular  structure  can 
be  obtained  in  less  time  using  the  ProductRank  and  QuotientRank  algorithms  described  in 
Sections  4.3  and  5.2,  respectively.  The  relative  gain  is  based  on  the  number  of  blocks  in 
the coarsest equitable partition and number of vertices, such that the performance gain is
$(n^2 \cdot t)/(b^2 \cdot r)$. Preliminary data suggest $(r \approx t) \le \log_{|\lambda_2(S)|} \tau$, reducing the gain to $n^2/b^2$.
Such  gains  can  be  obtained  on  some  web  graphs  [BLS+06],  many  trees,  and  grid  graphs. 
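
To give a feel for the size of the ratio $n^2/b^2$ on a tree, the sketch below counts the blocks of a complete binary tree using a simple neighbor-label refinement. It is illustrative only: the function names are hypothetical, the refinement is a plain unoptimized variant and not the method of Section 2.3.3.2, and the printed ratio is only the rough estimate that applies when $r \approx t$.

```python
def coarsest_equitable_partition(adj):
    """Refine vertex labels by neighbor labels until the partition is stable."""
    color = {v: 0 for v in adj}
    while True:
        # Signature: own label plus the sorted labels of all neighbors.
        signature = {
            v: (color[v], tuple(sorted(color[u] for u in adj[v]))) for v in adj
        }
        relabel = {s: i for i, s in enumerate(sorted(set(signature.values())))}
        refined = {v: relabel[signature[v]] for v in adj}
        # Each round can only split blocks, so an unchanged count means stable.
        if len(set(refined.values())) == len(set(color.values())):
            return refined
        color = refined


def complete_binary_tree(depth):
    """Undirected complete binary tree; vertex v has children 2v+1 and 2v+2."""
    n = 2 ** (depth + 1) - 1
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for child in (2 * v + 1, 2 * v + 2):
            if child < n:
                adj[v].add(child)
                adj[child].add(v)
    return adj


tree = complete_binary_tree(4)
n = len(tree)                                             # number of vertices
b = len(set(coarsest_equitable_partition(tree).values())) # number of blocks
gain = n ** 2 / b ** 2                                    # rough estimate only

print(n, b, gain)
```

For this tree the blocks coincide with the depth levels, so $b$ grows with the depth while $n$ grows exponentially in it.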

6.2. Future Work

6.2.1.  Implementation  Improvements 

Many  of  the  algorithms  described  herein  were  implemented  in  MATLAB  to  verify 
their  correctness.  A  key  area  of  future  work  is  to  design  robust  implementations,  e.g.,  the 
methods described in Sections 2.3.3.2 and 5.2. For instance, the current implementation
finds the coarsest equitable partition using 1-D Weisfeiler-Lehman stabilization, due to its
relevance to the results described herein. However, the method listed in Section 2.3.3.2 is
more  efficient. 

The  quotient  matrix  can  be  constructed  more  efficiently,  as  noted  in  Section  5.5. 
The  current  versions  also  use  dense  matrices,  but  sparse  matrices  must  be  used  to  process 
large  graphs,  e.g.,  web  graphs.  Adding  sparse  matrix  support  also  motivates  supporting 
personalization vectors, where, given a probability vector, v, the PageRank matrix, S,
yielded by applying the PageRank perturbation is (cf. Section 2.5.3)

$S = \alpha \cdot A \cdot D^{-1} + (1 - \alpha) \cdot v \cdot \mathbf{1}^{T}$.  (41)

This  modification  adjusts  the  PageRank  of  certain  vertices,  e.g.,  to  decrease  the  effect  of 
“spamming  done  by  the  so-called  link  farms”  [LaM06].  Personalization  may  decrease  the 
effectiveness  of  applying  the  graph’s  coarsest  equitable  partition,  since  a  personalization 
vector  may  increase  the  number  of  blocks  contained  in  the  coarsest  equitable  partition. 
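
The effect of a personalization vector on the blocks can be seen on a very small case. The sketch below builds the personalized matrix in the form $\alpha \cdot A \cdot D^{-1} + (1 - \alpha) \cdot v \cdot \mathbf{1}^{T}$ using dense Python lists. The three-vertex path, the two vectors, and the function names are assumptions chosen for illustration, and the graph is assumed to have no vertex of degree zero.

```python
def pagerank_matrix(A, v, alpha=0.85):
    """Dense S = alpha*A*D^-1 + (1-alpha)*v*1^T for a graph with no zero columns."""
    n = len(A)
    col_sum = [sum(A[i][j] for i in range(n)) for j in range(n)]  # diagonal of D
    return [
        [alpha * A[i][j] / col_sum[j] + (1 - alpha) * v[i] for j in range(n)]
        for i in range(n)
    ]


def power_method(S, tol=1e-12, max_iter=10000):
    """Iterate x <- S x, rescaling so the entries sum to one."""
    n = len(S)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        y = [sum(S[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        y = [yi / total for yi in y]
        if max(abs(a - c) for a, c in zip(x, y)) < tol:
            return y
        x = y
    return x


# Hypothetical example: the path 0-1-2, whose end vertices share a block.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
uniform = [1 / 3, 1 / 3, 1 / 3]
biased = [0.5, 0.25, 0.25]      # favors vertex 0

S_biased = pagerank_matrix(A, biased)
x_uniform = power_method(pagerank_matrix(A, uniform))
x_biased = power_method(S_biased)

print(x_uniform)
print(x_biased)
```

With the uniform vector the two end vertices receive the same value, whereas the biased vector separates them, so the partition that remains valid has three blocks instead of two.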

A  robust  implementation  should  also  leverage  parallel  hardware,  e.g.,  multi-core 
processors.  The  power  method  can  be  executed  in  parallel,  since  PageRank  matrix  rows 
can  be  independently  multiplied  by  a  PageRank  vector.  Weisfeiler-Lehman  stabilization 
can  be  implemented  in  parallel,  but  similarly  implementing  the  algorithm  for  finding  the 
coarsest equitable partition described in Section 2.3.3.2 in parallel requires more effort.

6.2.2. Other Linear Algebra Applications

Throughout this research, it was assumed the PageRank vector is being computed
by applying the power method to the stochastic PageRank matrix, S, or its corresponding
quotient matrix, Q. For instance, given Q and the normalized all-ones vector, r,
the PageRank vector is obtained by iteratively computing

$r \leftarrow Q \cdot r$
$r \leftarrow r / \sum r$  (42)

until r converges to the required numerical precision, $\tau$. Given Q’s dominant eigenvector,
r, the PageRank vector, x, is lifted and normalized using the block matrix, B, where

$x \leftarrow B \cdot r$
$x \leftarrow x / \sum x$.  (43)
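
The iterate-then-lift process can be sketched in a few lines. The code below is an assumption-laden illustration and not the QuotientRank listing of Section 5.4: it assumes an undirected graph without isolated vertices, takes the partition as given, and uses one common construction of the quotient, in which entry (i, j) sums the PageRank matrix entries from a vertex of block i over all vertices of block j. The thesis may define Q differently.

```python
def quotient_matrix(adj, blocks, alpha=0.85):
    """Assumed construction: Q[i][j] = alpha*q_ij/d_j + (1-alpha)*n_j/n."""
    n = sum(len(blk) for blk in blocks)
    b = len(blocks)
    Q = [[0.0] * b for _ in range(b)]
    for i, blk_i in enumerate(blocks):
        rep = blk_i[0]  # any vertex works because the partition is equitable
        for j, blk_j in enumerate(blocks):
            q_ij = len(adj[rep] & set(blk_j))  # neighbors of rep inside block j
            d_j = len(adj[blk_j[0]])           # common degree within block j
            Q[i][j] = alpha * q_ij / d_j + (1 - alpha) * len(blk_j) / n
    return Q


def quotient_rank_sketch(adj, blocks, alpha=0.85, tau=1e-12, max_iter=10000):
    """Power method on Q, then lift one value per block to every vertex."""
    Q = quotient_matrix(adj, blocks, alpha)
    b = len(blocks)
    r = [1.0 / b] * b
    for _ in range(max_iter):
        s = [sum(Q[i][j] * r[j] for j in range(b)) for i in range(b)]
        total = sum(s)
        s = [si / total for si in s]
        converged = max(abs(p - q) for p, q in zip(r, s)) < tau
        r = s
        if converged:
            break
    x = {v: r[i] for i, blk in enumerate(blocks) for v in blk}  # x <- B r
    total = sum(x.values())
    return {v: xv / total for v, xv in x.items()}               # x <- x / sum(x)


# Hypothetical example: a star with center 0 and leaves 1..4 (two blocks).
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
blocks = [[0], [1, 2, 3, 4]]
x = quotient_rank_sketch(adj, blocks)

print(x)
```

Here the iteration runs on a 2-by-2 matrix instead of a 5-by-5 matrix, and the lifted values agree with solving the full PageRank equations for the star by hand.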


One conjecture is that a similar process reduces the time needed to perform other
linear algebra tasks. For example, given an arbitrary linear system,

$A \cdot x = b$,  (44)

preliminary work suggests the solution vector, x, can be obtained by solving the system,

$Q \cdot r = c$,  (45)

where Q denotes A’s quotient matrix and c denotes the associated elements in b. Solving
for r and applying the lifting identity and block matrix, B, to r yields the solution vector,
x, where again, $x \leftarrow B \cdot r$.

A  key  nuance  is  that  the  coarsest  equitable  partition  must  be  found  with  respect  to 
A and each unique vector, b. Loosely stated, b partitions A in the same way the
personalization  vector,  v,  partitions  the  PageRank  matrix  (cf.  Sections  2.5.3  and  6.2.1). 
This  approach  may  decrease  the  time  needed  to  find  x  in  applications  that  require  solving 
structured  linear  systems,  e.g.,  certain  linear  regression  or  finite  element  systems. 
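
A small worked instance of this conjecture is sketched below. Everything in it is an assumption made for illustration: the 4-by-4 matrix, the right-hand side (named `rhs` to avoid confusion with the block count), the hand-chosen partition, and the helper names. The quotient is formed by summing each representative row over the blocks, and exact rational arithmetic is used so the two solutions can be compared directly.

```python
from fractions import Fraction


def solve(M, rhs):
    """Gauss-Jordan elimination in exact arithmetic; assumes M is nonsingular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs[i])] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * p for a, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def block_sums(A, i, blocks):
    """Sum of row i of A over the columns of each block."""
    return [sum(A[i][u] for u in blk) for blk in blocks]


def is_equitable(A, rhs, blocks):
    """True if rows in a block have equal block sums and equal rhs entries."""
    return all(
        rhs[v] == rhs[blk[0]] and block_sums(A, v, blocks) == block_sums(A, blk[0], blocks)
        for blk in blocks
        for v in blk
    )


def quotient_solve(A, rhs, blocks):
    """Solve Q r = c on the blocks, then lift r to every vertex (x <- B r)."""
    Q = [block_sums(A, blk[0], blocks) for blk in blocks]
    c = [rhs[blk[0]] for blk in blocks]
    r = solve(Q, c)
    x = [None] * len(A)
    for i, blk in enumerate(blocks):
        for v in blk:
            x[v] = r[i]
    return x


# Hypothetical structured system with a mirror symmetry.
A = [[4, 1, 0, 0], [1, 4, 1, 0], [0, 1, 4, 1], [0, 0, 1, 4]]
rhs = [1, 2, 2, 1]
blocks = [[0, 3], [1, 2]]

equitable = is_equitable(A, rhs, blocks)
x_quotient = quotient_solve(A, rhs, blocks)
x_full = solve(A, rhs)

print(equitable, x_quotient, x_full)
```

In this instance a 2-by-2 system replaces the 4-by-4 system. Changing `rhs` to a vector that is not constant on the blocks makes `is_equitable` return False, which reflects the nuance that the partition must be found with respect to both A and b.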




6.2.3.  k-D  Weisfeiler-Lehman  Stabilization 

The coarsest equitable partition is the most refined partition that can be obtained
by only considering the neighbors of each vertex. Thus, the coarsest equitable partition is
obtained by applying information in the first dimension, which contains the current labels
of all adjacent vertices. This concept gives rise to the method listed in Section 2.3.3.3 for
finding the coarsest equitable partition, 1-D Weisfeiler-Lehman stabilization. The concept
also can be generalized to k dimensions, or k-D Weisfeiler-Lehman stabilization [Bas02].
Although k-D stabilization was thought to decide graph isomorphism in polynomial time,
both parts of that conjecture have been shown to be false [Für87, CFI92, Für01].

The  k-D  equitable  partition  is  often  more  refined  than  the  1-D  equitable  partition. 
Since  the  k-D  partition  may  contain  more  blocks  than  the  1-D  partition,  it  yields  fewer 
performance  gains  with  respect  to  the  PageRank  algorithm  or  the  future  work  described 
in  Section  6.2.2.  However,  graphs  that  yield  finer  partitions  based  on  a  value  of  k  relative 
to  smaller  values  of  k  are  useful  for  assessing  algorithms  that  decide  graph  isomorphism. 
Although  a  library  of  graphs  strictly  based  on  k  does  not  exist,  the  libraries  constructed  by 
Weisfeiler  [Wei76],  Mathon  [Mat78],  and  Junttila  and  Kaski  [JuK07]  are  commonly  used. 
An  avenue  of  future  research  is  to  construct  a  graph  library  strictly  based  on  values  of  k. 

In  addition,  the  proof  developed  in  Chapter  3  can  be  extended  to  show  that  some 
linear  algebra  methods  of  determining  isomorphism  are  no  more  powerful  than  applying 
k-D  stabilization.  For  instance,  one  avenue  of  work  described  elsewhere  considered  using 
the  matrix  inverse  to  determine  graph  isomorphism  [AMB+07].  The  proof  constructed  in 
Chapter  3  can  be  extended  to  show  the  matrix  inverse  does  not  provide  more  information 
than  the  equitable  partition  yielded  by  applying  2-D  Weisfeiler-Lehman  stabilization. 

6.2.4. Open Call for Parallel Software that Decides Graph Isomorphism

Many algorithms decide graph isomorphism for either a restricted set of graphs or
two arbitrary graphs (cf. Section 2.3.4.2). Algorithms that decide graph isomorphism can
be classified into two groups, where an algorithm either computes an explicit permutation
between two graphs or computes a canonical certificate of each graph, i.e., two graphs are
isomorphs if some permutation between their vertices exists or their canonical certificates
are identical. Both types of algorithms apply graph invariants, e.g., the coarsest equitable
partition, to prune the search tree. For example, the classic algorithm developed by Babai,
Erdős, and Selkow uses an invariant that can be computed in linear time [BES80]. Their
algorithm yields a canonical isomorph of nearly all random graphs.

More  robust  algorithms  decide  graph  isomorphism  between  two  arbitrary  graphs. 
Some  readily  accessible  software  tools  include  Boost  [BST],  Groups  &  Graphs  [KoK06], 
LEDA [LED], Mathematica [MTH], and NetworkX [HSS]. Other algorithms that decide
graph isomorphism include VF2 [CFS+04], Bliss [JuK07], and ScrewBox [KuS07]. The
VF2 algorithm is the most modern algorithm that decides graph isomorphism by finding
a  permutation  between  two  input  graphs  instead  of  finding  their  canonical  isomorphs. 

The  standard  algorithm  used  to  decide  graph  isomorphism,  nauty,  includes  many 
other  isomorphism-related  functions  [McK81,  McK04].  Software  tools  that  require  such 
functions  often  interface  with  nauty,  such  as  GAP’s  GRAPE  package  [GAP,  Soi06]  and 
MAGMA  [MAG].  MATLAB’s  documentation  hints  that  its  bioinformatics  package  links 
with nauty [MAT]. Nauty’s widespread usage can be attributed to many factors, e.g., its
performance  and  function  set,  which  includes  functions  for  finding  canonical  isomorphs, 
orbit  partitions,  quotient  graphs,  and  automorphism  groups. 

Unfortunately, nauty does have some shortcomings, prompting this open call for
improvements to nauty or an alternative tool that implements its functions and addresses
its shortcomings. For example, nauty is freely accessible; however, it is not
open-source. The only known open-source package that is similar to nauty is the N.I.C.E.
library, which was recently written for the SAGE mathematics package [SAG, Mil07].
The N.I.C.E. library applies techniques used in nauty and summarized in Section 2.3.4.2.
The N.I.C.E. library can be integrated in military applications, whereas nauty’s copyright
statement currently precludes its use in military applications [McK04].

Currently,  N.I.C.E.  is  the  only  known  alternative  to  nauty  that  provides  a  similar 
set  of  functions,  such  as  obtaining  canonical  isomorphs  and  orbit  partitions.  N.I.C.E.  also 
appears  to  support  non-simple  graphs,  such  as  weighted  and  directed  graphs.  In  contrast, 
nauty’s documentation describes how to transform these graphs for processing [McK04].
However,  N.I.C.E.’s  stated  objectives  are  to  provide  an  open-source  and  understandable 
alternative  to  nauty,  i.e.,  N.I.C.E.  is  not  designed  to  provide  the  raw  speed  with  respect  to 
deciding  graph  isomorphism  that  is  presently  yielded  by  nauty. 

Hence,  the  most  ambitious  future  goal  is  to  construct  an  open-source  software  tool 
that  finds  canonical  isomorphs  and  orbit  partitions  and  whose  performance  is  competitive 
with nauty’s performance. This application would leverage the multi-core processors and
other parallel tools available in modern computing devices. For instance, this application
could reuse the method of finding the coarsest equitable partition that is to be implemented in
the parallel algorithm for determining the PageRank vector, as described in Section 6.2.1.
A software tool that decides graph isomorphism in parallel can benefit many applications,
including work related to UAV swarms, as described in Sections 2.1, 2.2.2, and 6.1.2.

6.3. Summary

The  PageRank  algorithm  is  often  used  to  order  query  responses,  e.g.,  web  pages 
matching  some  search  criteria.  However,  the  PageRank  algorithm  has  other  applications. 
For  instance,  the  results  described  herein  were  motivated  by  exploring  how  the  PageRank 
algorithm  or  tools  used  to  decide  graph  isomorphism  could  be  applied  to  sensor  networks, 
e.g., UAV swarms. One natural problem is to order the nodes based on some
measure of importance. If an attacker knows the 1st, 2nd, ..., nth most critical nodes, the
attacker  can  determine  which  nodes  to  attack  first.  A  similar  application  is  finding  nodes 
that  can  facilitate  spreading  (mis)information,  or  in  social  networks,  diseases  and  rumors. 
In  general,  the  PageRank  algorithm  is  useful  for  analyzing  scenarios  that  require  knowing 
the  probable  behavior  of  an  object  that  traverses  the  underlying  graph. 

The  results  described  in  Chapters  3-5  improve  the  performance  of  the  PageRank 
algorithm  in  several  ways,  e.g.,  if  it  is  being  applied  to  networks  such  as  UAV  swarms. 
For  instance,  the  practical  lower  bound  derived  in  Section  3.2  provides  the  minimum  time 
needed  to  obtain  the  PageRank  vector  of  an  arbitrary  UAV  swarm.  The  proof  described  in 
Sections  3.3  and  3.4  establishes  it  is  possible  to  easily  identify  nodes  that  must  have  equal 
PageRank  values  by  applying  the  graph’s  coarsest  equitable  partition. 

The  AverageRank  algorithm  listed  in  Section  4.2  ensures  such  nodes  have  equal 
PageRank  values,  even  if  a  PageRank  vector  is  numerically  computed.  The  ProductRank 
algorithm  listed  in  Section  4.3  finds  the  PageRank  vector  of  graphs  containing  such  nodes 
more  efficiently.  Both  methods  are  superseded  by  the  QuotientRank  algorithm  listed  in 
Section  5.4,  the  most  efficient  algorithm  described  herein.  Finally,  these  results  generated 
many  promising  avenues  of  future  research,  as  described  in  Section  6.2. 